NCC Education


About NCC Education 


NCC Education is a global provider and an awarding body of quality British education 
programmes in Business, IT and Sales and Marketing, ranging from Foundation to Masters level. 


Originally part of the National Computing Centre, we began offering IT qualifications in 1976 and 
from 1997 developed our portfolio to include IT qualifications for school children and Foundation 
and Business programmes. 


Today, NCC Education has Accredited Partners in over 45 countries, five international offices 
and academic managers worldwide employing the latest technologies for learning, assessment and 
support. NCC Education is also partnered with the British Council Education UK programme and 
quality assured by the QCA. 


Our programmes are developed to provide the required skills and knowledge to help students 
excel in their chosen career. NCC Education is recognised by universities, professional bodies 
and employers around the world. Our students can upgrade their skills on our professional 
development modular programmes, or complete a university degree in their home country or in 
the UK via the NCC Education International Degree Pathway. 


NCC Education provides students with the opportunity to gain internationally recognised UK
qualifications by studying at one of our global network of Accredited Partners either in the
classroom or online.


NCC Education values its students and offers continuous support and guidance throughout their 
learning experience. 


That’s why students worldwide choose NCC Education as their route to a quality British 
education. 


NCC Education Mission 


To be a global leader in the assessment and certification of quality British education programmes 
in the major transnational education markets. 


In line with our Mission Statement, we will provide: 


• Innovative career-orientated programmes in business and IT.
• The highest level of customer service in the industry.
• A complete solution for our partners and students.


NCC Education: Quality British Education


Our aim is to ensure all our students are equipped and competent in using and developing their 
IT and business skills in today’s transnational markets. As such, quality assurance is a rigorous 
requirement of all NCC Education products and partners. 


In support of this, NCC Education operates a Quality Management System that provides a 
framework of standards and procedures within which it manages and controls all its project, 
product and service activities. 


Specifically NCC Education’s quality objectives are: 


• To ensure that NCC Education Accredited Partners provide NCC Education
programmes which meet the requirements specified in our syllabus and
regulations.

• To specify and maintain syllabi and regulations that meet the career development
needs of students specialising in the IT and business sectors.

• To ensure where appropriate that syllabi and assessments meet international
academic standards.

• To provide the highest quality administration and support to the international
qualifications allied certification schemes.


How to Use Your Workbook


The author has been careful to structure this workbook with your assessment in mind. Each 
chapter therefore covers an essential part of the Programming Methods Syllabus. 


This is a workbook, not a textbook. It takes an activity-based, practical approach to learning
and is therefore highly interactive. It is carefully structured to support and encourage you to
discover and learn by doing.


Your lecturer will allow you time during lectures to perform the various exercises contained in 
the Workbook. 


Generally, after each exercise carried out during class time, 5 or 10 minutes will be spent on
lecturer-led, whole-group feedback.


After the lecture on a particular chapter you should ensure that you carry out the self study 
exercise provided at the end of the chapter. 


This workbook is the required textbook for the Programming Methods module. Throughout the
workbook you will be given pointers to where you can find additional information if you should
need it. This workbook is organised into eight subject-based chapters, each of which contains:


Learning Outcomes 


These appear at the very beginning of each chapter and outline the skills you will have acquired 
on completion of the chapter. 


Introduction 


Each chapter will always contain an introduction to the subject matter. 


Text 
Substantial text will follow regarding the subject matter, which will be interspersed with: 


• Exercises: these have been provided in order for you to test your knowledge of
the subject matter. Questions have been carefully chosen to reinforce the course
material.

Some exercises may be in the form of actions to be carried out and have been
provided to reinforce learning, confirm understanding and stimulate thought.

• Definition Statements: these statements define precisely the meaning of certain
terms and the context in which they are used within the module.

• Study Note: these are an aide-memoire and will help you to distinguish between
text that is background knowledge and that which you should understand and be
able to write about or carry out. You should pay close attention to these. They
are there to ensure you are fully aware of any important points raised within the
text.

• Self Study Exercises: these are usually provided at the end of each chapter. This
section of each chapter contains directed self study actions designed to enhance
the learning experience and allow students to explore topics to a greater depth.


Summary 


Finally, each chapter ends with a summary detailing what you should understand and be able to 
write about, or actions which you should be able to carry out. If you reach this stage and do not 
understand and/or cannot describe and explain the topics detailed in the summary, then we 
suggest you re-read the chapter, re-do any exercises/activities suggested and perhaps consider 
some further reading. 


Feedback 


Your feedback will help us improve our workbook. Should you have any questions regarding the 
content of this workbook, or have any suggestions on how it could be improved, then please 
contact us, as detailed on the feedback sheet contained at the end of this workbook.


1 Learning Outcomes


After completing this chapter you will be able to:

• Describe the main advantages and disadvantages of each generation
of language.

• Understand the reasons for their development.

• Explain the differences between each generation of language.

• Describe the basic concepts of object-oriented technology.

• Explain how object-oriented programming differs from structured
programming.


2 Introduction 


This chapter is intended to provide an overview of programming 
languages. It covers: 

• What is a programming language?

• How programming languages have developed.

• The language models.

• Ways to evaluate their differences.

• Programming from a chronological viewpoint.

• An introduction to object-oriented concepts.


Programming has developed gradually over the centuries, with a variety 
of social factors enabling this development to take place. 


The roots of programming lie in the abacus and the development of our 
numbering system in India and the Arab world. The first programs were, in 
effect, written for machinery such as looms and calculating machines 
during the Industrial Revolution of the 18th and 19th centuries.


It has taken many centuries for programming languages to reach their 
current sophistication, and this chapter aims to explore that history. The 
final part of the chapter introduces object orientation, a feature of very 
recent languages. 


V1.1 1-3 


Chapter 1 — Introduction to Programming Programming Methods 


3 What is a Programming Language? 


There are many definitions of what constitutes a programming language, 
and there is not one absolute answer. A definition appropriate to a 
programming language of the 19th century would not necessarily be 
detailed enough for a modern definition. This chapter aims to encourage 
you to explore your own ideas of what you think a programming language 
might be. 


Programming languages are needed in order to allow humans and 
computers to communicate. Computers, as yet, are unable to understand 
our everyday language and the way we talk about the world. 


Computers understand logic expressed mathematically through what is
known as machine code. Machine code consists of 1s and 0s (the
binary system), in which the majority of humans would find it very difficult
to communicate.


Programming languages enable humans to write in a form that is more 
compatible with a human system of communication. This is then 
translated into a form that the computer can understand. 


Here are some ideas on what constitutes a programming language. 


• A programming language has been defined as a tool to help the
programmer.

• A way of writing that can be read by both a human being and a
machine.

• A sequence of instructions for a machine to carry out.

• A way for a human being to communicate with a machine which is
unable to understand natural language.

• A computer language offers a means of writing algorithms that can be
understood by both humans and machines. Machines are unable to
understand natural language, so a human being writes algorithms in
the language, and these are translated into machine code. Machine
code is difficult for humans to use, so a language ‘translates’
human-readable text into a machine-readable form.

• A programming language offers humans a standard way of expressing
algorithms to solve particular problems. As languages offer a
convention, they allow other humans to read the program, and
change it if they need to.


Exercise 1.1 [30 minutes]


There is no absolute way to define a programming language, but some 
examples are provided to compare with the ideas of the class. 


Form into small groups. Allow ten minutes for group discussion, and to 
produce a definition of a programming language. The definitions should 
then be presented to the class, and discussed as required. 


Allow ten minutes for group discussion on the following points: 


• Are these examples saying the same thing as each other?

• How do the definitions differ?


4 Language Models 


There are currently several kinds of languages: 


• Imperative languages can also be referred to as procedural languages.
Imperative/procedural programs are formed from collections of basic
commands, most often assignments and Input/Output (I/O), where the
execution is sequenced by control structures (e.g. loops, conditionals,
blocks). Imperative (or procedural) languages specify explicit
sequences of steps to follow to produce a result. These languages
include C, Pascal, Java, VBScript and C#.

• Functional languages are based on the lambda calculus of the 1930s.
Programs consist of collections of function definitions and function
applications. Higher order functions, abstract functions and lack of
side effects might also be associated with functional programs.
Examples include LISP, ‘pure’ Scheme, FP, ML and Haskell.
Programs written in a functional language are generally compact and
elegant, but have tended, until recently, to run slowly and require a lot
of memory.

• Logic programs consist of collections of statements within a
particular logic. Typically that logic is predicate logic. The original logic
programming language was Prolog.

• Object-oriented (OO) languages are used to write programs consisting of
objects that interact with each other. Some also associate inheritance
and polymorphism with OO languages. Examples are Simula,
Smalltalk-80, Eiffel, Java and Visual Basic 2005.


• Declarative programs are collections of declarations. Many functional
and logic languages are also declarative. You describe a pattern to be
matched without writing the code to match the pattern. Some patterns
would be difficult to write by hand. Declarative languages describe
relationships between variables in terms of functions or inference
rules and the language executor (interpreter or compiler) applies a
fixed algorithm to these relations to produce a result.

• Scripting languages work in conjunction with a larger application,
support control of a variety of applications, are interpreted, or some
combination of these. They are designed to automate frequently used
tasks that usually involve calling or passing commands to external
programs.


• Programs in parallel languages are collections of processes (or sometimes
structured data) that communicate with each other; such languages include C*
and Ada.

• Assembly languages, such as FAB and MASM, correspond directly to a
machine language, allowing machine code instructions to
be written in a form understandable by humans.

• Multiparadigm languages support more than one programming
paradigm. They allow a program to use more than one programming
style. The goal is to allow programmers to use the best tool for a job,
acknowledging that no one paradigm solves all problems in the
easiest or most efficient way. Examples are ActionScript, Python and
Curry.

• Non-English-based programming languages do not take their keywords
from the English dictionary. Examples of this are Chinese Basic
(Chinese), HPL (Hebrew), var’aq (Klingon) and Lexico (Spanish).
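

To make the contrast between two of these models concrete, the sketch below writes the same small task (summing the squares of a list of numbers) in an imperative style and in a functional style. Python is used only because it is one of the multiparadigm languages listed above; the function names are invented for this illustration.

```python
from functools import reduce


def sum_of_squares_imperative(numbers):
    """Imperative style: an explicit sequence of steps updates a running total."""
    total = 0
    for n in numbers:
        total = total + n * n
    return total


def sum_of_squares_functional(numbers):
    """Functional style: the result is built by applying functions, with no reassignment."""
    return reduce(lambda acc, n: acc + n * n, numbers, 0)


print(sum_of_squares_imperative([1, 2, 3]))  # 14
print(sum_of_squares_functional([1, 2, 3]))  # 14
```

Both versions give the same answer; what differs is whether the program describes the steps to follow or the functions to apply.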


Definition: Algorithm 


A systematic procedure that produces — in a finite number of steps — 
the answer to a question or the solution to a problem. 
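

As an illustration of this definition, the sketch below shows one well-known algorithm, Euclid's method for finding the greatest common divisor of two whole numbers. It is offered only as an example of a systematic procedure that finishes in a finite number of steps; the function name is chosen for this illustration and the inputs are assumed to be non-negative integers.

```python
def greatest_common_divisor(a, b):
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero."""
    while b != 0:
        a, b = b, a % b  # the remainder is always smaller than b, so the loop must end
    return a


print(greatest_common_divisor(48, 18))  # 6
```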


Definition: Lambda-calculus 


A system of mathematical logic originated by Alonzo Church in the 1930s,
which concerns the application of functions to their arguments.
Lambda-calculus and its variations have been important in the 
development of computer programming languages. 
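

Many modern languages borrow the idea directly. The sketch below uses Python's `lambda` keyword to build small unnamed functions and apply them to arguments. It is an informal illustration rather than a formal treatment of the lambda calculus, and the names `add`, `compose`, `increment` and `double` are invented for this example.

```python
# A function of one argument that returns another function (known as "currying").
add = lambda x: lambda y: x + y

# A function that takes two functions and returns their composition.
compose = lambda f: lambda g: lambda x: f(g(x))

increment = add(1)        # apply add to the argument 1
double = lambda x: x * 2

# Apply the composed function to 5: double first, then increment.
print(compose(increment)(double)(5))  # 11
```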


5 Evaluating Languages 


Programming languages can be evaluated from a number of viewpoints, 
depending on either the programmer, the environment in which the 
programmer works or the standards of the organisation. When developing 
software, a programmer should consider which language is most suitable 
to the task, rather than relying on a language with which they are familiar. 
For example, a spreadsheet is inappropriate for developing a database.
All the features of the language need to be considered rather than just
one particular feature. A wrong choice can mean that the software has to 
be re-written, which can be both very frustrating and time-consuming. 
When developing a system, it is important to choose the language 
according to the goals that you have in mind. For instance, you may want 
to: 


• teach object-oriented programming, so you may choose Smalltalk;

• create interactive multimedia and use Java, or create business
applications in COBOL.


If you want to create a large robust system, you probably would not 
choose a language used for rapid development. 


There are a number of different ways in which the programmer can think 
about the design of the system, from the top-down of structured 
programming to object-oriented design issues. Some languages are 
geared towards one particular style of design, whilst others incorporate 
many types. Each of these language paradigms enables the programmer
to consider the problem from a different viewpoint.


There are a few basic questions that can be asked to help in the decision- 
making process: 


1. How readable is the language, to humans? If parts of the program are 
going to be read or altered separately from the entire program, it 
might be worth considering how legible they will be. It is also useful to 
consider the length of names to be allowed in the language — for 
instance, an early form of Fortran allowed only six characters. This 
can lead to clumsy abbreviations that are difficult to read. Statements 
such as GO TO, FOR, WHILE and LOOP have increased the 
readability of programs, and lead to neater programs. These 
statements also affect the syntax, or grammar. 


2. How easy is the task of writing the program in this particular 
language? A programming language that is easy to write in can make 
the process easier and faster. It may help to reduce mistakes. FOR 
loops and other types of statement allow the programmer to write in 
much simpler code. This will save time and money, and also make the 
program smaller. 


3. How reliable is the language? Some languages help the programmer
to avoid making errors and so to create robust programs; others do
not. A program that is not robust can cause errors, and code can
‘decay’. A language that helps the programmer to avoid mistakes is
also easier to use.


4. What is the cost of developing the program in this language? Is the 
language expensive to use and to maintain? Programs may need to 
be updated or re-developed, and an expensive language may make 
this prohibitive.


5. How complicated is the syntax going to be? Syntax is an important
consideration. Clarity and ease of understanding are important, as is a
logical, sensible syntax. Errors are very likely to occur where one area 
of syntax too closely resembles another, and the program may prove 
difficult to debug. Some theorists reason that if it is difficult to write a 
program to parse the language, then it follows that it will be 
problematical for the programmer to get it right. 


6. Does the language have standards? Languages that have standards 
for writing programs have greater readability — for instance, Java has 
standards for naming, commenting and capitalisation. 


Definition: Syntax 


The rules defining the prescribed sequences of symbolic elements in a 
language. The syntax rules define the form of various constructs in the 
language, but say nothing about the meaning of those constructs. 
Examples of constructs are: expressions, procedures, and programs 
(in the case of programming languages); terms, well-formed formulae 
and sentences (in the case of logical languages). 
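

The point that syntax says nothing about meaning can be seen in practice. The sketch below uses the `ast.parse` function from Python's standard library, which checks only whether text follows Python's syntax rules. The helper name `is_syntactically_valid` is invented for this illustration.

```python
import ast


def is_syntactically_valid(source):
    """Return True if the text obeys Python's syntax rules, whatever it means."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Well formed, even though dividing by zero would fail when the program is run.
print(is_syntactically_valid("total = 1 / 0"))  # True

# Not well formed: two assignment symbols in a row break the rules.
print(is_syntactically_valid("total = = 1"))    # False
```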


Exercise 1.2 [60 minutes] 


Algorithms, the lambda calculus and logic have all been influential in 
the development of programming languages. 


Search the web or find a book in your library that will show you some
examples of algorithms in computing. You may find it helpful when
looking online to use the keywords “algorithm + computing +
introduction”, or try typing “introduction to algorithms in computing”.
You may find it helpful to look for examples of programming code that 
show how programs differ when using different algorithms. 


Search for information on different types of languages to gain a better 
understanding of the main categories of language and the tasks they 
are best suited to. 


Definition: Parsing 


The process of deciding whether a string of input symbols is a 
sentence of a given language and, if so, determining the syntactic 
structure of the string, as defined by a grammar for the language. This 
is achieved by means of a program known as a parser, or syntax 
analyser. 
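

A very small parser can be sketched in a few lines. The example below assumes a tiny invented grammar in which an expression is one or more terms joined by `+`, and a term is either a whole number or an expression in brackets. The function decides whether the input is a sentence of that grammar and, if so, returns its structure as nested lists. It is a teaching sketch, not a description of how any particular compiler works.

```python
def parse(text):
    """Return a nested-list syntax tree for text, or None if it is not a sentence."""
    pos = 0  # index of the next unread symbol

    def parse_expr():
        # expr := term ('+' term)*
        nonlocal pos
        node = parse_term()
        while node is not None and pos < len(text) and text[pos] == "+":
            pos += 1
            right = parse_term()
            if right is None:
                return None
            node = ["+", node, right]
        return node

    def parse_term():
        # term := digits | '(' expr ')'
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1
            node = parse_expr()
            if node is None or pos >= len(text) or text[pos] != ")":
                return None
            pos += 1
            return node
        start = pos
        while pos < len(text) and text[pos] in "0123456789":
            pos += 1
        return int(text[start:pos]) if pos > start else None

    tree = parse_expr()
    # The input is a sentence only if every symbol has been consumed.
    return tree if tree is not None and pos == len(text) else None


print(parse("1+(2+3)"))  # ['+', 1, ['+', 2, 3]]
print(parse("1+"))       # None
```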
Study Note
Languages accepted by the academic community and industry: 


FORTRAN is an early specialised language used by the science, 
mathematical and engineering communities, both in academia and in 
industry. 


COBOL is still used in industry, and historically it has been the 
language of business and mainframes. 


Artificial Intelligence (AI), Virtual Environments and Natural Language
are being increasingly used. For example, a game was developed in
the UK using AI characters, which attracted attention from the army.
The company that had produced it was then requested to develop
software for the army using the game as a basis.


Pascal was developed as a teaching language, and is not viewed as a 
bona fide development language. With the increased interest in object 
orientation, Pascal has declined as a teaching language and Smalltalk, 
or latterly Java, is often used as a teaching tool in its place. 


C++, Visual Basic, and Delphi are currently used as development 
languages in the UK, the latter two being RAD (Rapid Application 
Development) tools i.e. suited to quick prototyping and iterative 
development. Other visual languages include Visual J++, Visual C++ 
and Java Studio. 


The most common scripting languages used for web development are 
ActiveX (especially when related to plug-ins), JavaScript and VBScript. 
New developments such as Active Server Pages (ASP) can use any 
language, as long as it is specified. When ASP was developed, it was 
proprietary for the Windows Operating Systems and dependent on 
existing ownership of software, but otherwise essentially free for 
developers. ASP is now widely used and software has been developed 
to make it platform independent. Other database scripting languages 
include Cold Fusion, which is produced by the Allaire Corporation and 
a little more difficult to implement. 


Open source programming has grown with the growth of Linux. The 
Linux operating system is free and the code is available for other 
programmers. This concept of making code available to the
programming community has spread to other programs, such as
Netscape.


6 Chronology of Programming


Computation has been in existence for a lot longer than the computer. 
Machines and devices were developed either to make use of these computations
or to carry them out. Though these systems may seem primitive, modern
computing still uses some of these early systems of computation. The 
complex systems of calculation often developed hundreds of years ago 
form the basis of modern computer languages. 


The earliest tool used by European society for computing numbers was 
the abacus. Arabs introduced the abacus into Europe around the year 
1000AD, though the abacus itself was invented long before. 


Study Note 


Until the adoption of the abacus, arithmetic had been extremely
difficult. European numbers were still based on the Roman system of
letters. The year 999AD was written as DCCCCLXXXXVIIII, and
became a simple M on the advent of the year 1000AD. 999 consisted
of D (500) CCCC (400 or 4 lots of 100) L (50) XXXX (4 lots of 10) V (5)
IV or IIII (4). It would have been a difficult process indeed to add 999
and 999 together and it was even suggested at one point that 9000
should be the maximum number (MMMMMMMMM).
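

The arithmetic in this note can be checked with a short program. The sketch below assumes the purely additive form of Roman numerals used above, in which 4 is written IIII rather than IV. The function names are invented for this illustration.

```python
# Symbols in descending order of value (additive notation only).
ROMAN_VALUES = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}


def additive_roman_to_int(numeral):
    """Add up the value of every letter in the numeral."""
    return sum(ROMAN_VALUES[letter] for letter in numeral)


def int_to_additive_roman(number):
    """Write a number using as many of each letter as will fit, largest first."""
    result = ""
    for letter, value in ROMAN_VALUES.items():
        count, number = divmod(number, value)
        result += letter * count
    return result


print(additive_roman_to_int("DCCCCLXXXXVIIII"))  # 999
print(int_to_additive_roman(999 + 999))          # MDCCCCLXXXXVIII
```

Even with a computer doing the work, the sum has to be converted to ordinary numbers, added, and converted back, which gives some sense of how awkward the notation was for arithmetic by hand.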


With the abacus came not only a quick and easy means to make 
calculations, but also zero and infinity. The abacus revolutionised 
arithmetic, and introduced two vital concepts that had been missing. ‘Zero’ 
and ‘infinity’ are vital to modern numerical disciplines, and to computer 
languages. A computer understands the world in terms of the binary
system of numbering, in other words as a series of 1s and 0s. This would
be impossible without the 0.
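

To see what a number looks like as a series of 1s and 0s, the sketch below converts an ordinary whole number to binary by repeatedly dividing by 2 and recording the remainders. The function name is invented for this illustration and the input is assumed to be a non-negative whole number.

```python
def to_binary(number):
    """Return the binary digits of a non-negative whole number as a string."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = str(number % 2) + digits  # the remainder is the next digit, right to left
        number //= 2
    return digits


print(to_binary(10))   # 1010
print(to_binary(999))  # 1111100111
```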


The table in Appendix A shows a brief chronology of developments in 
computers and programming from prehistory to recent times. 
Developments in programming have also been dependent on the 
computer hardware itself. Until the invention of the personal computer 
(PC), programmers expected to share time on a mainframe computer. 
The mainframe was generally a gigantic computer that was given a room 
to itself. Modern personal computers have allowed programmers to work 
on their own machines, whilst the development of networks has meant 
that sharing has become possible.


7 The Language Generations


The language generations span many decades, and begin with the 
development of machine code. Each generation adds new features and 
capabilities for the programmer to use. Languages are designed to create 
programs of a particular type, or to deal with particular problems. 


Modern languages have led to the development of completely different 
styles of programming, involving the use of more human-like or natural 
language and re-usable pieces of code. This section aims to give you an 
introduction to the features of each language generation. 


Note that although it seems as if each generation supersedes the others, 
the ‘older’ languages are still very much in evidence. You will find many 
programs written in C and COBOL, and engineers frequently use 
FORTRAN. 


• The first generation of languages was machine language. Instructions
and addresses were numerical. These programs were tied to the
machine they were developed on.

• The second generation allowed symbolic instructions and addresses.
The program was translated by an assembler. Languages of this
generation include IBM BAL and VAX Macro. These languages were
still dependent on the machine they were developed on.

• Third generation languages (3GLs) allowed the programmer to concentrate
on the problem, rather than the machine they were writing for. Other
innovations included structured programming and database
management systems. 3GLs include FORTRAN, COBOL,
Pascal, Ada, C, and BASIC. All 3GLs are much easier for
humans to understand than earlier generations.

• Fourth generation languages (4GLs). These are known as non-procedural:
they concentrate on what you want to do rather than how
you are going to do it. 4GLs include SQL, PostScript, and
relational database orientated languages.


• 5GL (fifth generation). These languages did not appear until the
1990s, and have primarily been concerned with Artificial Intelligence
and Fuzzy Logic. Programs that have been developed in these
languages have explored Natural Language (making the computer
appear to communicate like a human being).
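

The difference between saying how (3GL) and saying what (4GL) can be sketched by solving one small task both ways. The example below uses Python for the step-by-step version and SQL, run through Python's standard `sqlite3` module, for the non-procedural version. The table, names and marks are invented for this illustration.

```python
import sqlite3

students = [("Asha", 72), ("Ben", 45), ("Chen", 88)]

# Procedural (3GL style): spell out HOW to find the students who passed.
passed_procedural = []
for name, mark in students:
    if mark >= 50:
        passed_procedural.append(name)

# Non-procedural (4GL style): state WHAT is wanted and let the database decide how.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE students (name TEXT, mark INTEGER)")
connection.executemany("INSERT INTO students VALUES (?, ?)", students)
rows = connection.execute(
    "SELECT name FROM students WHERE mark >= 50 ORDER BY name"
)
passed_declarative = [row[0] for row in rows]
connection.close()

print(passed_procedural)   # ['Asha', 'Chen']
print(passed_declarative)  # ['Asha', 'Chen']
```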
Study Note


Structured programming was developed during the late 1960s after Edsger
Dijkstra’s insightful comments on the harmful nature of the GO TO
statement. Dijkstra and others subsequently created a set of
acceptable structures in programming that would enable development 
without GO TO statements. These structures produced programs that 
were easier to read by humans, easier to debug and easier to test. 
These structures have become some of the founding principles of 
modern programming methods. 


Although the principles of structured programming have had a profound 
effect on the programming world, it was not until the 1970s that an 
actual language was created for teaching structured programming. 
Pascal was developed especially for this purpose, though it is much 
derided as a ‘toy’ language, and appears to have never been used in 
commercial development. It appears that existing languages such as 
COBOL and FORTRAN were changed to accommodate Dijkstra’s 
structures, or that programming included these structures through more 
indirect methods. Later generation languages such as C are fully-fledged
structured programming languages; these are from the third
generation and procedural, in that they are both written and executed
step-by-step. C, in its turn, has formed the foundation for the
object-oriented language C++.


The three structures allowed in structured programming are sequence, 
selection, and iteration. Structures are also thought of in terms of 
substitution and combination, i.e. structures can be substituted or 
combined with other structures, as long as the result equals a 
sequential structure. Structured programming also pays attention to 
design and testing with emphasis on a top-down approach. The top- 
down approach uses modularity as a means of ensuring that the 
program is both legible and manageable, and also that these modules 
can be tested as they are developed. This is beneficial as it ensures 
that all modules are tested and that bugs can be found in the modules 
that have most recently been added or altered. 
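

The three structures can all be seen in a few lines of code. The sketch below counts passes and fails in a list of marks, using sequence, iteration and selection, with no GO TO statements. The function name and the pass mark of 50 are assumptions made for this illustration.

```python
def classify_marks(marks):
    """Count how many marks are passes and how many are fails."""
    # Sequence: statements carried out one after another.
    passes = 0
    fails = 0
    # Iteration: repeat the same steps for every mark.
    for mark in marks:
        # Selection: choose between two paths.
        if mark >= 50:
            passes += 1
        else:
            fails += 1
    return passes, fails


print(classify_marks([72, 45, 88, 30]))  # (2, 2)
```

Because the function is a self-contained module with one entry and one exit, it can be tested on its own before being combined with other modules, in line with the top-down approach described above.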


Structured programming also places emphasis on program
documentation, which can be in the form of a chart or the structured
coding/listing. This documentation allows for subsequent updating of
modules, making these modules easier to locate in the program.
Modularity also ensures greater opportunity for re-use of modules
during development.


Definition: Rapid Application Development (RAD)


A programming system that enables programmers to quickly build 
working programs. In general, RAD systems provide a number of tools 
to help build graphical user interfaces, which would normally take a 
large development effort. Two of the most popular RAD systems for 
Windows are Visual Basic and Delphi. 


Study Note 
Areas of change since 1993. 


There have been a number of changes in the programming community 
since the early 1990s. Perhaps the most widely publicised of these has 
been the development of Java. Other areas of innovation have been 
Rapid Application Development Packages such as Visual Basic and 
Borland Pascal. 


RAD applications have been heralded by their manufacturers as the 
solution to programming without programmers. Applications can be 
developed using a ‘visual’ or ‘drag and drop’ environment. Whilst it is 
the case that these programs do indeed speed up development, there 
are inherent problems with RAD that need to be considered during 
development. Emphasis on speed of development can mean that good 
design is neglected, and the re-usability of components is not 
optimised. Time ‘saved’ during development can lead to time lost after 
production, as the application can become unmanageable and difficult 
to update. 


Other areas of change include: 


• Web-based mark-up languages such as Cold Fusion and Active
Server Pages that interface with databases to provide dynamic web
content.

• A move towards distributed objects rather than the large mainframe
applications of the past.

• The emergence of ActiveX and Java applets: mini programs that
run independently and can therefore be used to customise and create
interactive websites.

• Visual Basic being thoroughly integrated with Microsoft Office to
customise programs.

• During the 1990s, networks, TCL (Tool Command Language), PERL
(Practical Extraction and Report Language), CGI (Common
Gateway Interface) etc. all became common in industry.

• An ANSI (American National Standards Institute) standard for C++.


Definition: TCL


Short for Tool Command Language, and pronounced “T-C-L” or “tickle”,
TCL is a powerful interpreted programming language developed by John
Ousterhout. One of the main strengths of TCL is that it can be easily
extended through the addition of custom TCL libraries. It is used for 
prototyping applications as well as for developing CGI scripts, though it 
is not as popular as PERL for the latter. 


Definition: CGI 


Abbreviation of Common Gateway Interface, a specification for
transferring information between a World Wide Web server and a CGI
program. A CGI program is any program designed to accept and return
data that conforms to the CGI specification. The program could be
written in any programming language including C, PERL, Java or
Visual Basic.
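

A minimal CGI program can be sketched as follows. Under CGI, the web server passes the query string of the request to the program in the `QUERY_STRING` environment variable, and the program writes a header block, a blank line and then the content to standard output. The example is written in Python for consistency with the other sketches in this chapter. The `name` parameter and the function `build_response` are invented for this illustration, and the way a server is configured to run the script is not shown.

```python
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Build a plain-text CGI response that greets the name given in the query."""
    parameters = parse_qs(query_string)
    name = parameters.get("name", ["world"])[0]
    body = f"Hello, {name}!"
    # Headers come first, then a blank line, then the content.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # The web server supplies QUERY_STRING; it is empty if the script is run by hand.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```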


Definition: PERL 


Short for Practical Extraction and Report Language, PERL is a 
programming language developed by Larry Wall, especially designed 
for processing text. Because of its strong text processing abilities, 
PERL has become one of the most popular languages for writing CGI 
scripts. PERL is an interpreted language, which makes it easy to build
and test simple programs.


8 Introduction to Object-Oriented Concepts 


Programmers who undertake structured analysis, design and 
programming, need to think about data items and how to manipulate and 
organise them. Object orientation offers a different way to think about 
software systems. 


The philosophy of object orientation proclaims that applications can be 
built by envisioning objects that work together. In theory, this should be 
an easier way to develop programs because we live in a world made up of 
objects, know how they relate to one another, and ought to be able to 
transfer our understanding of them to the equivalent in software terms. 


Object orientation allows a developer to construct systems based on the 
idea of components, as opposed to the structures that form the basis of 
structured programming. This component-based development allows 
objects to be re-used or extended, reducing development time 
dramatically. 

Object-oriented programming allows the following:

• A system can be constructed from a set of objects, just as a house is
built from bricks, windows and doors (amongst other things).

• Adding new capabilities to existing objects can expand a system: a
door for a house could be a door to a room or a cupboard door, or this
could be extended to include garage doors.

• Creating new objects can expand a system; for example, a house
object may have stairs or a lift. No stairs could mean a bungalow,
stairs would allow the house to have more than one floor, and a lift
would allow a tall block of flats to be built. A house could be extended
by including a conservatory.

• The brick, window and door objects, which were designed for the
first house, can also be used to build other houses. This reduces the
development time, as the existing design can be used to create other
objects.


The purpose of this introduction to object-oriented programming is to 
encourage you to start thinking in terms of objects, as this approach is 
increasingly popular in general software development. 


9 What is an Object? An Introduction to Classes


The world around us is full of objects and modern software aims to imitate 
the real world. Just like things in the real world, objects have state and 
behaviour. 


An object’s state can be thought of as the features that describe it; a car 
may have an engine type, a make, model number and a colour. 


The car’s behaviours are the things that it knows how to do, such as 
accelerate, brake, change gear and turn the windscreen wipers on and 
off. 

The state for a dog could be name, breed, colour, and whether hungry. 
The dog can have behaviours such as snoring, barking, fetching, and 
chasing cats. 
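
As a rough illustration, the dog described above could be sketched in Python as follows. The workbook does not prescribe a language, so the class name Dog, the attribute names and the text returned by bark are all assumptions made for this example only.

```python
class Dog:
    """A dog object: state is held in attributes, behaviour in methods."""

    def __init__(self, name, breed, colour, hungry=False):
        # State: the features that describe this particular dog.
        self.name = name
        self.breed = breed
        self.colour = colour
        self.hungry = hungry

    # Behaviour: the things the dog knows how to do.
    def bark(self):
        return f"{self.name} says Woof!"

    def eat(self):
        # Behaviour can change state: after eating, the dog is not hungry.
        self.hungry = False


rex = Dog("Rex", "Labrador", "black", hungry=True)
print(rex.bark())
rex.eat()
print(rex.hungry)
```

Note how the state (name, breed, colour, hungry) and the behaviour (bark, eat) live together inside the one object.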

An object can be:

1. A physical thing in the ‘real world’.
2. A representation of reality.
3. A tangible or visible thing.
4. A thing to which action or thought can be directed.
5. Passive: doing nothing until activated, e.g. a switch.
6. Active: continually monitoring until conditions change, e.g. a
   thermostat.

An object is never:

1. A value (e.g. name).
2. A process (e.g. sort).
3. Time (e.g. five minutes).


10 Classes 


An object is an instance, or example, of a category or class. The house
described earlier is a class or category, and each individual dwelling, e.g. a
particular apartment or bungalow, is an instance of that class. In the real world,
we are instances of the people class, and we may also be instances of 
other classes such as the brother class, mother class, and student class. 


Software objects also have a state and behaviours; these form the 
structure of the object. The object’s state is made up of items of data 
called attributes (or properties), which describe aspects of the object, 
whilst its behaviours are the operations that the object carries out. 


Definition: Attribute 


An attribute is a property of a class. It describes a range of values the
property may hold in objects (that is, instances) of that class. Every
object of the class has a specific value for every attribute.


Definition: Property 


Characteristic of an object. In many programming languages, the term 
property is used to describe attributes associated with a data structure. 


Real world examples of object features are those that we might possess 
as a person. As instances of the person class, our attributes could be 
height, age, weight, nationality (note that the data for each of these 
attributes will be different in every case, because we as individuals are 
unique), and our operations consist of eating, sleeping, drinking, going to 
college, doing homework, playing football and swimming. 


A developer wishing to create a system based on people-related
information, which could be used for marketing purposes, may look at the
following attributes and operations for the person class. Attributes may
consist of age, name, address, telephone number, email address and income,
and operations could include takes holidays, reads magazines, drinks
alcohol, goes shopping and drives cars.


Classes have another advantage; they can be used as a template for 
creating other objects. The person class would give you a template for 
creating other instances of the person class. You have the template for 
creating a population. 


The computer class may have a series of attributes such as brand name, 
model name, size of hard drive, processor speed. The computer class 
operations may consist of play latest games, play CD-ROMs, send email,
surf web, and play music. The computer class can act as a template for 
creating new instances of the class, i.e. new objects. 


To put this more simply, the class can be thought of as a mould for 
producing new objects. 
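
The idea of a class as a mould can be sketched in Python. This is only an illustration: the class name Computer, its attribute names, the brand and model values and the send_email method are assumptions chosen to match the computer example above.

```python
class Computer:
    """The mould (template) from which individual computer objects are made."""

    def __init__(self, brand_name, model_name, hard_drive_gb, processor_ghz):
        # Attributes: every instance has its own values for these.
        self.brand_name = brand_name
        self.model_name = model_name
        self.hard_drive_gb = hard_drive_gb
        self.processor_ghz = processor_ghz

    def send_email(self, recipient):
        # An operation that every computer made from this class can perform.
        return f"{self.model_name} sends an email to {recipient}"


# Two new objects produced from the same class, with different data.
office_pc = Computer("Acme", "Desk 100", 500, 2.4)
games_pc = Computer("Acme", "Play 900", 2000, 3.6)

print(office_pc.send_email("a@example.com"))
print(games_pc.hard_drive_gb)
```

Each call to Computer(...) produces a new object with the same structure but its own attribute values.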


This ability to create new objects easily is an important aspect of object- 
oriented software development. An object in the real world has many 
attributes and behaviours, and it follows that the more features you 
include with your class, the more it will reflect reality. For example, a 
games computer could include the following attributes in its specification: 
the amount of memory in MB, the type of graphics card, floppy disk drive, etc. There
are many other operations that the computer can perform, from desktop 
publishing to recording music. 


Classification 


Objects that share similarities and properties can be grouped together. 
Classification looks at the shared attributes and behaviours to create or 
describe classes of objects. 


A biologist studying a group of animals such as lizards, crocodiles, 
chameleons, turtles, may decide that these belong to a class known as 
reptiles. These reptiles share behaviour such as being cold-blooded and 
laying eggs, as well as data such as how big they are and what colour 
they are. 


Classes are defined by specifying data (attributes) and behaviour
(operations or methods). The data for an object can also be referred to as
its variables, and its operations or methods are known as its
responsibilities. There are a number of different terms for responsibilities,
and they may be referred to as operations, methods or functions.


The reptiles class defines the data and responsibilities for the class as 
follows: 

Reptiles Class

Data                Colour
                    Size
Responsibilities    Laying eggs
                    Rearing young

Figure 1.1 An Example of a Class


Definition: Variable 


A unit of storage that can be modified during program execution,
usually by assignment or read operations. A variable is generally
denoted by an identifier or name. In logic, a variable is a name that
can stand for any of an infinite set of values.


Responsibilities and data can be stored either in the class as a whole or in
each instance of a class. The responsibilities or operations for the reptiles
class could include the ability to grow. An instance of the reptiles class
could grow to a particular size. Some instances of the reptiles class
would be able to grow scales or be able to swim. The ability to move
would be a class method, whilst the ability to swim would be an instance
method.


As a class, people have eyes, but the colour of our eyes varies, so in any
one instance, the colour of eyes is a variable. Likewise, a consumer class
for marketing information might be as follows:


• Class variables: total number of consumers interviewed.

• Instance variables: address of particular consumer.
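
The consumer example above might look like this in Python. It is a minimal sketch: the names Consumer, total_interviewed and address, and the two sample addresses, are assumptions made for illustration.

```python
class Consumer:
    # Class variable: one value shared by the class as a whole.
    total_interviewed = 0

    def __init__(self, address):
        # Instance variable: each consumer object holds its own value.
        self.address = address
        # Every new consumer adds one to the shared total.
        Consumer.total_interviewed += 1


first = Consumer("1 High Street")
second = Consumer("22 Mill Lane")

print(Consumer.total_interviewed)
print(first.address, "/", second.address)
```

The total belongs to the class, so there is only one copy of it, whereas each consumer object carries its own address.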


Definition: Data 


Information, in any form, on which computer programs operate. The 
distinction between a program (instructions) and data is a fundamental 
one in computing. It is in this fundamental sense that the word is used 
in terms such as data bus, data cartridge and data protection. 


Data can be distinguished from graphics, text, and speech. Data is 
distinguished by the fact that it is organised in a structured, repetitive, 
and often compressed way. This definition of the term data relates 
closely to database, data independence, data model and data 
processing. 

Definition: Operations or Methods


In a programming language, whatever is carried out by an operator or,
more generally, anything that can take place within a program: a
declaration, an assignment, a selection, a loop, the call of a function
and so on.


10.2 Different Sorts of Classes 




There are two sorts of class that can be created, the abstract class and 
the concrete class. 


An abstract class may have one or more subclasses, but never an
instance. In other words, an abstract class may be inherited by another
class, but no object can be created directly from it. Reptile is an abstract
class: its data and responsibilities can be inherited, but a reptile is not an
animal in its own right.


A concrete class, however, is a class that can have one or more
subclasses and/or instances. Subclasses of the reptile class, such as lizard
and crocodile, are concrete classes. They may have subclasses such as
iguana, and can occur a number of times, i.e. have many instances.
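
One way to express the difference in code is shown below, using Python's standard abc module. The workbook does not describe any particular language mechanism, so treat this as a sketch under stated assumptions: the class names follow the reptile example, while the move method and its return value are invented for illustration.

```python
from abc import ABC, abstractmethod


class Reptile(ABC):
    """Abstract class: it can be inherited from, but never instantiated."""

    def __init__(self, colour, size):
        self.colour = colour
        self.size = size

    @abstractmethod
    def move(self):
        """Each concrete subclass must say how it moves."""


class Lizard(Reptile):
    """Concrete class: it may have instances and subclasses."""

    def move(self):
        return "scurries"


class Iguana(Lizard):
    """A subclass of a concrete class; it inherits move from Lizard."""


try:
    Reptile("green", 10)
except TypeError as error:
    print("Cannot create a Reptile:", error)

iggy = Iguana("green", 150)
print(iggy.move())
```

Trying to create a Reptile directly fails, whereas any number of Lizard or Iguana objects can be created.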


Exercise 1.3 [60 minutes] 


Find a simple taxonomy for the animal kingdom. It can be any part of 
the classification that you are familiar with. For instance, you can look 
at cats as a group, with everything from lions and tigers to the domestic 
cat. 


Take a look at the animals that are grouped together. What attributes 
do they share? Are there any behaviours that are common to all of 
these animals? Are there any attributes that are specific to particular 
animals? Do these animals have special kinds of behaviour? 


List all the shared attributes and behaviours that the animals have. 


Next, list the more specialised attributes and behaviours along with the 
animals or smaller groups of animals to which they belong. 


Exercise 1.4 [60 minutes] 


Find some specifications for computers produced by different
manufacturers. These may be from company websites or computer 
magazines. Write a list of all the attributes common to these 
computers. 

11 Encapsulation (Information Hiding) and Abstraction


Information hiding is an important concept in object-oriented systems, and 
takes place in two important ways. Firstly, information about objects is 
hidden from other objects by the system itself; this is known as 
encapsulation. Secondly, the developer can choose to hide information 
about a class or object to ‘streamline’ that object to suit their needs. This 
is known as abstraction. 


11.1 Encapsulation 


Encapsulation can also be referred to as black box technology. Black box 
refers to technology which shields the user from its mechanics or 
technological workings. Most modern electronic appliances hide their 
technological processes from users. 


Appliances such as televisions, radios, etc., hide how they show television 
and radio programmes, and all the user sees are those processes being 
carried out. Most television viewers or radio listeners are not interested in 
how their equipment works, merely that it does. With encapsulation, an
object bundles together its attributes (data) and its operations so that they
are inseparable; they are part of the object. This information is hidden
(hence information hiding), so that only the object knows about it and can
act on it.


For instance, a computer knows what processor it is using and how 
efficiently it is running, but there is no direct way to find out. Another 
example might be that human beings know how hungry they are and how 
to eat, but there is no direct way to find out how hungry a person is or how 
the digestion system works. To find any of these things out, the object 
must be sent a message, just as a human being would be asked whether 
or not he/she is hungry. 


Encapsulation offers a distinct advantage to developers, as it reduces the 
potential for error. A broken object can affect the whole system, and if the 
software engineer needs to repair that object, it may mean that the object 
can be repaired individually, rather than repairing the whole system. 


In the real world, a television is usually linked to a video recorder, but they 
hide their operations from one another. If the television breaks down, it is 
only the television that needs repairing or replacing, rather than the 
television, video and any other devices that might be attached. 


Encapsulation and information hiding mean that only an object can 
manipulate its data, through using its own methods or operations. This 
prevents other objects from accessing its insides, and prevents accidents. 
A human digestive system knows about its data and operations, but does 
not make these available to the outside world. This prevents accidents
happening to an important object in the human body. 
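
The hungry human example can be sketched in Python as follows. The names Person, is_hungry and eat, and the hunger scale used, are assumptions for illustration. Python does not strictly enforce hiding; a leading underscore is the conventional signal that an attribute is internal to the object.

```python
class Person:
    def __init__(self, hunger_level):
        # Leading underscore: by convention, internal to the object.
        self._hunger_level = hunger_level

    def is_hungry(self):
        """The message other objects send to ask about hunger."""
        return self._hunger_level > 5

    def eat(self, amount):
        # Only the object's own methods change its data.
        self._hunger_level = max(0, self._hunger_level - amount)


sam = Person(8)
print(sam.is_hungry())
sam.eat(5)
print(sam.is_hungry())
```

Other objects never read or change the hunger level directly; they can only ask the question or request the operation.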


Although an object hides its operations from the outside world, it does 
allow a means of communicating with the outside world. This means of 
communicating is known as an interface. Interfaces are a feature of 
modern household items as well as encapsulation. Televisions and videos 
have remote controls to change channel and alter volume, microwaves 
have buttons or dials to set temperature and cooking duration, washing 
machines have dials to choose a washing cycle. Computers communicate 
with users through an interface, for instance, operating systems and other 
software packages which are used. 


Exercise 1.5 [60 minutes] 


Find as many electronic objects in your house or classroom as you 
can. Alternatively, you can visit an electrical shop. These machines 
should all have some kind of interface for communicating with the 
outside world. 


What sorts of interface do these objects have? Do they differ according 
to the type of appliance, or are they part of the appliance’s design? 


Study some of the more familiar machines, or find some information
about how they operate. What behaviours is the interface hiding? Is it
useful to be able to hide these behaviours?


Exercise 1.6 [30 minutes] 


Think about the interface to a washing machine. What features does 
that interface normally have? Write down a list of features in the 
interface. Now think about how the washing machine behaves. What 
operations does it hide (encapsulate) from the person using the
washing machine? 


11.2 Abstraction 




Abstraction enables the developer to re-use a class and filter out 
operations and attributes from that class which are superfluous to needs. 
An object may have a long list of associated features that are not always 
relevant to the system the developer wishes to create. 


For instance, the marketing information class may provide extensive 
details of consumer lifestyles. A developer may decide to produce a 
system to sell information about pet-owning consumers to pet food 
manufacturers. The developer could choose to filter out operations such 
as the consumer’s favourite restaurants and hobbies, which newspapers 
they read and how many visits they make to the cinema. The pet food
manufacturer may only be interested in where pet owners live, where they 
shop, whether they travel by car, how many pets they have and of what 
type, and what brand of pet food they buy. 


The software developer may also need to create software that tracks the 
number of consumer profiles in each area in order to find which areas are 
under-represented. In this case, it may be that the detailed information 
recorded for each profile may be more than is needed for the system. 
Consumer spending habits could be left out, leaving just the basic 
demographic information for each consumer. 


This consumer profile can be filtered in any way to suit the software 
developer’s needs, and whatever the developer is left with after this 
process is an abstraction of the consumer profile. 
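
A very simple way to picture this filtering in code is given below. It is a sketch only: the profile is held as a Python dictionary, and every field name and value in it is hypothetical, chosen to echo the pet food example.

```python
# A full consumer profile (all field names and values are hypothetical).
full_profile = {
    "town": "Manchester",
    "shops_at": "Corner Market",
    "travels_by_car": True,
    "pets": {"cat": 2},
    "pet_food_brand": "Brand X",
    "favourite_restaurant": "The Old Mill",
    "newspapers": ["Daily News"],
    "cinema_visits_per_year": 6,
}

# The fields the pet food manufacturer is interested in.
PET_FOOD_FIELDS = ("town", "shops_at", "travels_by_car", "pets", "pet_food_brand")


def abstract_profile(profile, fields):
    """Return a simplified view containing only the fields of interest."""
    return {name: profile[name] for name in fields}


pet_food_view = abstract_profile(full_profile, PET_FOOD_FIELDS)
print(pet_food_view)
```

The full profile is unchanged; the abstraction is simply a smaller view of it that suits one purpose.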


12 Messages and Operations 


Objects communicate with each other by sending messages to one 
another. An object sends a message to another object telling it to perform 
an operation. This works in the same way as remote controls that send 
messages to objects such as televisions, to tell them to perform 
operations such as changing channel. The message itself only tells the 
object to perform the operation, not how to perform it. The object itself 
knows how to perform this operation from the class it belongs to. 


12.1 Methods 


There is another fundamental difference between traditional programming
and object orientation. Data traditionally has things done to it, whereas an
object can do things. The things an object can do are called its methods
(they may also be referred to as operations, which is not strictly accurate
but is common practice). In programming terms, the difference would be:


Traditional data item: open (door) 
Object: door open 


An object’s methods are not necessarily available to all other objects in
the system. Some methods can be accessed by outside objects and
these are known as public methods. Other methods are available only to
methods within the object itself, and these are known as private methods.
In the real world, a computer would have some public methods that
enable a human to interact with the system, such as ‘open program’ and
‘change screensaver’. There are, however, other methods that are
available only to the computer itself, such as accessing the RAM.
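
The door example and the idea of public and private methods can be put together in a short Python sketch. The names open_door, Door and _release_latch are assumptions for illustration, and the leading underscore is a Python naming convention for private methods, not something the workbook specifies.

```python
# Traditional style: a free-standing procedure acts on a passive data item.
def open_door(door_record):
    door_record["is_open"] = True


door_data = {"is_open": False}
open_door(door_data)


# Object-oriented style: the door object itself knows how to open.
class Door:
    def __init__(self):
        self.is_open = False
        self._latch_released = False

    def open(self):
        """Public method: other objects may call this."""
        self._release_latch()
        self.is_open = True

    def _release_latch(self):
        """Private by convention: used only by the object's own methods."""
        self._latch_released = True


front_door = Door()
front_door.open()
print(door_data["is_open"], front_door.is_open)
```

In the first half something is done to the data; in the second half the object is asked to do something for itself.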

Exercise 1.7 [30 minutes]


Open a program on your computer such as a word processing or 
drawing package. For instance, open a word processing package and 
think about what tasks you normally complete to produce a word- 
processed essay. Think of each button on the toolbar at the top as an 
object. What operations can each button carry out? If you are not sure, 
open a new document or open an old one for editing. Make alterations 
to the text or add new content to the document. What is the button 
object doing to the document or to the selected part of the document? 
For example, the ‘bold’ button can embolden the text or can 
unembolden it (return it to normal). 


Simple button objects for making text bold, italic or underlined: 


[Figure: the B (bold), I (italic) and U (underline) toolbar buttons]


Below is a picture of the dialogue box for formatting the fonts in your 
document. This shows you the many operations that can be performed 
from this dialogue box. 


[Figure: the Font dialogue box, with Font, Character Spacing and Animation tabs, options for font, font style, underline and effects, and a preview pane]

Exercise 1.8 [30 minutes]


Take a look at your computer and the messages you are sending it 
using your keyboard and mouse. Are there any that are the same? 


If the messages are the same, are they the same whether they are 
sent via the mouse or the keyboard? 


12.2 Messages 


When an object receives a message from another object, it activates a 
method or operation. These messages are also known as requests or 
function calls. The television remote control sends a message to the 
television to perform an operation or activate a method, such as change 
channel. This process of transmitting a message from the message 
sender to the message receiver is called message passing. The message 
sender does not know how the message receiver carries out this method 
as this information is hidden, hence the term information hiding. 


Messages may also contain arguments (parameters) to clarify the 
request. For example, the television remote control will tell the television 
to change to a particular channel, or to reduce the volume by one unit. A 
microwave can be asked to cook for a certain number of minutes, and a 
video to turn itself on at a particular time. 
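
Message passing with arguments can be sketched in Python as one object calling a method on another. The class and method names below (Television, RemoteControl, change_channel, reduce_volume and so on) and the starting channel and volume are assumptions made for this example.

```python
class Television:
    """The message receiver: it knows how to carry out each operation."""

    def __init__(self):
        self.channel = 1
        self.volume = 10

    def change_channel(self, number):
        self.channel = number

    def reduce_volume(self, units=1):
        self.volume = max(0, self.volume - units)


class RemoteControl:
    """The message sender: it says what to do, not how to do it."""

    def __init__(self, television):
        self.television = television

    def press_channel(self, number):
        # The argument clarifies the request: which channel to change to.
        self.television.change_channel(number)

    def press_volume_down(self):
        self.television.reduce_volume(1)


tv = Television()
remote = RemoteControl(tv)
remote.press_channel(5)
remote.press_volume_down()
print(tv.channel, tv.volume)
```

The remote control never touches the television's channel or volume data itself; it only sends messages.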


13 Relationships 


There are a number of relationships that objects can have with each
other. Principal among these are inheritance and association.


13.1 Inheritance 


Classes are categories of objects, and an object is therefore an instance 
of a particular class. In providing a template for objects, a class also 
provides characteristics that are inherited by objects. A television object 
therefore inherits all the attributes and operations of the television class. 
Not only can an object inherit from its class, a class may also inherit from 
another class. 


This process is called inheritance. For example, a rice cooker, toaster, 
microwave and electric kettle are all classes in their own right, but each 
class is also a member of the kitchen appliance class. The kitchen 
appliance class is the parent or superclass of all the others, and these 
other classes inherit the characteristics of the kitchen appliance class,
e.g. switch on, switch off, and buttons to turn the appliance’s
operations on and off accordingly. The rice cooker, toaster, microwave
and electric kettle are all therefore subclasses of the kitchen appliance
class. 
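
The kitchen appliance example might be written in Python like this. It is a sketch: the class names follow the text, while the toast and boil methods and the strings they return are assumptions added for illustration.

```python
class KitchenAppliance:
    """Superclass: characteristics shared by every kitchen appliance."""

    def __init__(self):
        self.is_on = False

    def switch_on(self):
        self.is_on = True

    def switch_off(self):
        self.is_on = False


class Toaster(KitchenAppliance):
    """Subclass: inherits switch_on and switch_off, and adds toast."""

    def toast(self, slices):
        if not self.is_on:
            return "Switch the toaster on first"
        return f"Toasting {slices} slices"


class ElectricKettle(KitchenAppliance):
    """Subclass: inherits switch_on and switch_off, and adds boil."""

    def boil(self):
        return "Boiling" if self.is_on else "Switch the kettle on first"


toaster = Toaster()
toaster.switch_on()  # inherited from KitchenAppliance
print(toaster.toast(2))
print(ElectricKettle().boil())
```

Neither subclass defines switch_on or switch_off; both receive them from the superclass.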


Superclasses can also be thought of as a form of taxonomy. In the real 
world, mammals share certain inherited characteristics such as bearing 
live young, being warm-blooded and feeding milk to their offspring. 
Subclasses of the mammals superclass inherit these properties, as well 
as their own particular characteristics. For instance, whales have fins and 
can swim underwater, whereas cats have four legs and fur and usually 
remain on land. 


Definition: Taxonomy 


In a broad sense, the science of classification, but more strictly, it is the
classification of living and extinct organisms, i.e. biological
classification. The term is derived from the Greek word ‘taxis’, meaning
‘order’.


Exercise 1.9 [30 minutes] 


Consider the mammals superclass, and define some subclasses within
that group, for example, dogs and cats. What characteristics would
these subclasses inherit?


Exercise 1.10 [30 minutes] 


Get together with a friend or think about this on your own if you prefer. 
What characteristics do you share with other people in your group? 


You are going to develop the student superclass. Write down the 
attributes and operations that you all share. Can you think of any 
subclasses for your student superclass? 


13.2 Association


Objects are often associated with each other, in the same way that people 
might be associated with each other in daily life. These object 
relationships can either be one-way or they can be two-way. Objects can 
also have relationships which are one-to-one, or they may be one-to- 
many or, alternatively, an object can have different relationships with the 
same object. These associations are known as multiplicity. 


A one-way or uni-directional relationship exists between a human being 
and a television set. The human can tell the television which operations to 
perform, but the television has no means of telling the human to perform 
operations. A two-way or bi-directional relationship exists when two 
people are married. In a workplace, an employee will have a manager. 
This relationship is “is the manager of” in one direction and “is managed
by” in the other direction. Typically, a manager will have a relationship 
with many employees and each employee will have a relationship with
only one manager. 


Classes can also associate with more than one class. For example, a 
person can travel both by boat and aeroplane. The person class is 
therefore associated with both the boat and the aeroplane class. 


13.3 Multiplicity 


Multiplicity is a term indicating the number of objects that are associated 
with one particular class. These associations can be either one-to-one or 
one-to-many. For instance, a wife has one husband; they have a one-to- 
one association. A mother, however, may have many children and she 
has a one-to-many association with those children. Multiplicity is very 
common in the real world; human beings walk on two legs (a one-to-two 
association), dogs walk on four legs (a one-to-four association), spiders 
walk on eight and insects on six. 
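
The manager and employee relationship described under Association gives a convenient example of a two-way, one-to-many association. The Python sketch below is illustrative only; the class names, the add_employee method and the sample names are assumptions.

```python
class Employee:
    def __init__(self, name):
        self.name = name
        self.manager = None  # "is managed by": at most one manager


class Manager:
    def __init__(self, name):
        self.name = name
        self.employees = []  # "is the manager of": one-to-many

    def add_employee(self, employee):
        self.employees.append(employee)
        # Record the link in the other direction too (two-way association).
        employee.manager = self


boss = Manager("Priya")
alice = Employee("Alice")
bob = Employee("Bob")
boss.add_employee(alice)
boss.add_employee(bob)

print(len(boss.employees))
print(alice.manager.name)
```

The list on the manager side holds many employees, while each employee refers back to a single manager.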


14 Polymorphism 


Often developers will find that operations can have the same name even 
though they are associated with different objects. Polymorphism allows 
developers to re-use terminology and allows it to have more than one 
meaning. This can prove useful for a number of reasons. System 
modellers can talk to clients using terms that are familiar to them, 
maintaining the client's own terminology. Some operations fall naturally 
into certain terminology, such as open and close, and it would be 
preferable to use words that have an obvious meaning. The ability to 
allow more than one meaning for each operation means that the 
developer can maintain terminologies without having to invent a new 
unique word every time a similar operation occurs. 


Polymorphism means that each object understands how it is supposed to 
perform an operation even though it may have the same name as another 
object’s operation. For instance, just as a human understands how 
different objects in the real world open, so each class in object orientation 
understands how that operation occurs. A human being performs the 
operation ‘open’ on many different items in everyday life, from doors to
boxes, books, bank accounts and conversations. These are all different 
types of operation, although they share the same name. Each class would 
understand how it is supposed to perform the operation ‘open’ in its own 
right. 
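
The ‘open’ example can be sketched in Python as three unrelated classes that each provide their own open method. The class names and the sentences returned are assumptions for illustration only.

```python
class Door:
    def open(self):
        return "The door swings on its hinges"


class Book:
    def open(self):
        return "The book falls open at page one"


class BankAccount:
    def open(self):
        return "The account is registered with the bank"


# The same message, 'open', is sent to each object in turn.
# Each object responds in the way its own class defines.
for item in (Door(), Book(), BankAccount()):
    print(item.open())
```

The code sending the message does not need to know which kind of object it is dealing with.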


The term true polymorphism means that objects may share closely related 
method names. These polymorphic methods perform the same operation, 
but perhaps in a different way. Usually this stems from the object’s 
inheritance, as the different class methods may be related by their 
superclass. 

15 Summary


In this chapter we have covered:

• A general introduction to programming, including the concept of a
  programming language.

• A chronology of programming.

• The idea of language generations.

• An introduction to structured programming.

• An introduction to object-oriented concepts, such as:
  - classes;
  - encapsulation and information hiding;
  - messages and operations;
  - relationships;
  - polymorphism.


There are many languages and a number of different approaches. 
Structured programming brought a measure of order to what was 
previously a chaotic process. Now there is increasing emphasis on 
viewing software as objects, rather than concentrating on data. This is the 
reason for introducing the topic of object-oriented concepts so early, but it 
will reappear throughout this workbook. 


16 Self Study 


Self Study 1 [30 minutes] 
List the six key questions to ask when evaluating languages for 
use in a project. 


What are the characteristics of each of the five generations of 
language? 


How might we know whether a language fits into one generation 
rather than another? 


What criteria do we use? 

What was Dijkstra’s contribution to programming? 

What was the main problem he was trying to overcome? 
What was his solution? 


What do we mean by structured programming? 

Self Study 2 [60 minutes]
How does object orientation encourage you to think? 
How is a system constructed? 
How is a system expanded? 


What are the supposed advantages of object-oriented 
programming? 


What is meant by state? 

What is meant by behaviour of objects? 

What are the general criteria for grouping objects? 
What types of classification are there? 


What is the difference between an abstract and a concrete 
class? 


What is meant by the term encapsulation? 
Why is encapsulation important in the world of objects? 


Describe the two key sectors of relationship, inheritance and 
associations. 


What are the differences between them? 


Get into the habit of thinking in classes. Look around you and 
categorise things. 


You will be meeting new words and new concepts. 


Back up these materials with further reading and thinking. 


It will take time to adjust. 


Self Study 3 [60 minutes] 


Look at the history of computing — key events table in Appendix A. 
Pick five events since 1960 and research them for further details. Find 
out more about the people and achievements involved, especially the 
problems they were trying to solve. How did their successes advance 
programming? 

1 Learning Outcomes


At the end of this chapter you will be able to: 


• Understand the importance of the order of precedence.
• Describe control structures.
• Use calculations to construct computer programs.
• Describe the different variable types and naming conventions.


2 Introduction 


Every programming language has rules which must be followed in the 
same way as a spoken language is based on rules. The English language 
consists of words which must be used in a grammatical way to have 
meaning. For example, the words computer, mouse, screen and data 
are nouns and click, enter and use are verbs. These are examples of 
the syntax of a language. The sentence “You can use the mouse to click 
on the screen to enter data into the computer” is a meaningful sentence. 


When learning the English language, you may start with words such as 
nouns and verbs, moving on to adjectives and building phrases. 
Sentences are then developed, moving on to whole paragraphs, which 
consist of sentences relating to the same topic. 


Definition: syntax (syntax rules) 


The rules defining the legal sequences of symbolic elements in a 
language. The syntax rules define the form of the various constructs in 
the language, but say nothing about the meaning of those constructs. 


Definition: language constructs 


A syntactic structure or set of structures in a language to express a 
particular class of operations. The term is often used as a synonym for 
control structure. 


We can relate a programming language to the English language. The 
building blocks are not nouns and verbs, but reserved words and 
keywords which have a special meaning in programming languages. 
Reserved words are specific to a programming language and cannot be used as
identifiers. Each language will have a different set of reserved words.
Examples are IF, REPEAT. Keywords are identifiers which indicate a 
specific command. Examples are PRINT, INPUT. These terms will be 
addressed in more detail in Chapter 4. 
Definition: reserved word


A word that has a specific role in the context in which it occurs, and 
therefore cannot be used for other purposes. For example, in many 
programming languages the words ‘IF’, ‘THEN’, ‘ELSE’ are used to 
organise the presentation of the written form of statements (between 
‘THEN’ and ‘ELSE’ and following ‘ELSE’) whose execution is governed 
by the value of the Boolean expression between ‘IF’ and ‘THEN’. The 
use of IF, THEN or ELSE as identifiers is thus not permitted in these 
languages since they are reserved words. 


See also keyword. 


Definition: keyword 


A symbol in a programming language that has a special meaning for 
the compiler or interpreter. 


For example, keywords in BASIC include IF, THEN, PRINT. 


The keywords guide the analysis of the language, and in a simple 
language each keyword causes activation of a specific routine in the 
language processor. 


Since this workbook is concerned with concepts associated with 
programming in general, rather than a particular programming language, 
you will be using pseudocode throughout the workbook. 


Pseudocode is one of the tools that can be used to write a preliminary 
plan, which can then be developed into a computer program. It is not a 
standard language, although programmers often use terms within it that 
closely resemble the actual language to be used. Its purpose is to 
describe the algorithm (the method of solving the problem) in a form that 
can easily be understood and then translated into the actual programming 
code required. The syntax of the language to be used and the fine detail 
of the program are ignored until writing the source code, i.e. the program, 
which will be compiled (translated into binary digits that the computer can 
use) later. 
Definition: pseudocode


Another name for pseudolanguage. 


A form of representation used to provide an outline description of the 
specification for a software module. Pseudolanguages contain a 
mixture of natural language expressions embedded in syntactic 
structures taken from programming languages (such as IF ... THEN
... ELSE). The formality of the definition varies from ad hoc (e.g.
defined within the project team) to being sufficiently formal to enable 
automatic parsing and syntax checking (e.g. supported by a CASE 
(Computer Aided Software Engineering) tool). Pseudolanguages are 
not intended to be executed by computer; they must be interpreted by 
people. 


You will also be drawing diagrams to represent the algorithm. The 
diagrams are similar to flowcharts, although the official standards are not 
used. Flowcharting was used extensively in the 1960s before 
pseudocode and other charting techniques were available. The diagrams 
used have been chosen as compatible with more modern techniques of 
charting used later in the course for the object-oriented programming 
languages. 


The first problem you will consider is to input three numbers from the 
keyboard of a personal computer, add them together and output the 
result. This can be thought of as a more general problem in that data is 
accepted, a process is performed on the data and the results are returned 
to the user or retained for use at a later stage. 


input the numbers  ->  calculate the sum  ->  print the answer


Figure 2.1 Solution in the form of a diagram 
The steps involved in solving this problem are shown in Figure 2.1.


There must be a method of accepting data and storing it in the computer’s 
memory. The computer program also needs to know where the data is 
stored in the memory so that it can process the data. Programming 
languages use variables for this purpose. 


3 Variable Types and Names 


Variables are used to store data in the computer’s memory (RAM) during 
the time that the program is running. At this stage we will consider three 
attributes associated with variables. They are: 

e the address in the memory where the data is stored; 


e the actual data stored which can change during the execution of the 
program; 


e the name of the variable or identifier. 


Definition: variable 


1. A unit of storage that can be modified during program execution, 
usually by assignment or read operations. A variable is generally 
denoted by an identifier or by a name. 


2. The name that denotes a modifiable unit of storage. 


In the example, to calculate the sum of three numbers you would need 
four variables. Programmers can choose the names of the variables, with 
certain limitations, which we will address shortly. It is good practice to 
think of names which are meaningful in the context of the program, as it 
will help you to understand the coding. In this case we could use 
number1, number2, number3 and sum. 


The pseudocode could be written as: 


Begin
   ACCEPT number1, number2, number3
   sum := number1 + number2 + number3
   PRINT sum
End program


When you declare number1 as a variable, an address in the memory is 
allocated to it and the number input through the keyboard is stored in this 
memory location. Every time the identifier number1 is used in the 
program, this memory location would be accessed and thus the data 
stored in it can be read and manipulated. The programmer uses the
identifier to gain access to the data; the generated program code deals
with the connection between the variable and the memory location.


It was mentioned earlier that there are certain limitations on the names 
which can be used for variables. The name chosen must conform to 
certain rules so that the computer you are using to build your program 
does not receive ambiguous instructions. 


This means that you cannot use any of the words that make up the 
programming language. These words are the reserved words and, as 
stated previously, they have a special meaning in the programming 
language. As all programming languages have a unique list of reserved 
words, you should make sure you are familiar with the ones used in the 
programming language you will be learning. Punctuation marks and 
spaces may also have a special meaning within the grammar or syntax of 
the computer language you are learning, so these cannot be used as part 
of a variable name, either. You should make sure that you know how to 
create valid variable names in the programming language you will be 
using. In the pseudocode you have used so far, the words ACCEPT and 
PRINT can be thought of as reserved words. 


Identifiers are usually written as text, although there are differences 
between programming languages in terms of the additional characters 
that can be used. In the pseudocode we will use the underscore symbol 
to combine text to provide a meaningful name, e.g. average_student for 
the average mark for a student. In some languages this would be written 
as AverageStudent. 


Definition: identifier 


A string of characters used to identify (or name) some element of a 
program. The kind of element that can be named depends on the 
programming language; it may be a variable, a data structure, a 
procedure, a statement, a higher-level unit, or the program itself. 


Definition: assignment statement 


A fundamental statement of all programming languages (except 
declarative languages) that assigns a new value to a variable. The 
typical form in Algol-like languages is variable:= expression where := is 
read as “becomes”; the symbol suggests a left-pointing arrow to signify 
the conveyance of a value to the variable on the left. Other languages 
(particularly BASIC, C and FORTRAN) use = as the assignment
operator, e.g. a = b + c. This leads to problems in expressing the
concept of equality. BASIC, being an unsophisticated language, is
able to use = for both purposes; C uses == for equality and FORTRAN
uses .EQ.
Definition: name


A notation for indicating an entity in a program or system. (The word 
can also be used as a verb.) The kinds of entity that can be named 
depend on the context, and include variables, data objects, functions, 
types, and procedures (in programming languages), nodes, stations, 
and processes (in a data communication network), files, directories, 
devices (in operating systems), etc. The name denotes the entity, 
independently of its physical location or address. Names are used for 
long-term stability (e.g. when specifying a node in a computer program) 
or for their ease of use by humans (who recognise the name more 
readily than an address). Names are converted to addresses by a 
process of name lookup. 


In many languages and systems, a name must be a simple identifier,
usually a textual string. In more advanced languages, a name may be
composed from several elementary components according to the rules
of the language.


3.1 Variable Types 


The compiler or interpreter needs to know what type of data is being used 
because it will have a different way of dealing with whole numbers, 
fractional numbers and characters. As you will see later, the type of data 
being used has a great influence on program design. 


Definition: compiler 


A program that translates high-level language into absolute code, or 
sometimes into assembly language. The input to the compiler (the 
source code) is a description of an algorithm or program in a problem- 
oriented language; its output (the object code) is an equivalent 
description of the algorithm in a machine-oriented language. 


Definition: interpreter 


A language processor that analyses a line of code and then carries out 
the specified action, rather than producing a machine-code translation 
to be executed later. 
The different types of data are represented as follows:


Data                                               Type
Whole numbers (e.g. 1, 678, 45)                    Integer
Fractional numbers (e.g. 3.678, 789.3)             Real or Float
Letters (e.g. a, f, Z)                             Character
Names (e.g. Dr. H. Kelly)                          String
FALSE or TRUE, 0 or 1 (only two values allowed)    Boolean


English postal codes contain letters, numbers and a space (which the 
computer treats as a blank character code rather than as an absent 
character), e.g. SW1A 2AJ. 


The computer would use this data as a string because it has a blank 
character in it. Numbers are only declared as integer or real if they are 
going to be used in calculations. 


Exercise 2.1 [45 minutes] 
Choose an appropriate data type for variables to be used in each of the 
following examples: 

a) A metric measurement. 

b) A number of people. 

c) The total of a grocery bill. 

d) Four letters used as choices in a multi-choice examination. 


e) An international telephone number e.g. (01234) 123456. 
f) A catalogue code e.g. AZ1234. 


Many programming languages insist that variables are defined at the
beginning of the program. This emphasises the importance of data in 
deciding the structure of the program. It is good practice to identify the 
variables used and the types in the pseudocode. It is also good practice 
to describe any printed or displayed output, rather than just printing a 
number. You must remember that the output and input to the program 
are the human-computer interface and as such must be meaningful and 
helpful to the user. The pseudocode is now: 


Use variables: number1, number2, number3, sum OF TYPE Integer
DISPLAY “Enter three numbers”
ACCEPT number1, number2, number3
sum := number1 + number2 + number3
PRINT “The sum is ” sum
end program
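

As an illustration only, the pseudocode above might be sketched in Python as follows. Python is an assumption here, since the workbook is deliberately language-neutral, and the function name add_three is hypothetical. Python does not require variables to be declared, so the optional type annotations stand in for the OF TYPE Integer line. Fixed values are used in place of keyboard input so that the sketch runs unattended.

```python
def add_three(number1: int, number2: int, number3: int) -> int:
    """Return the sum of three whole numbers."""
    # sum := number1 + number2 + number3
    # ("total" is used because "sum" is the name of a Python built-in)
    total = number1 + number2 + number3
    return total


# PRINT "The sum is " sum
# A real program would obtain the three values with input() and int().
print("The sum is", add_three(4, 5, 6))
```

Note that the output is labelled with text, as recommended above, rather than printed as a bare number.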


Although variables are used to hold data which can change throughout 
the program, the data stored is always under the control of the program. 
Some variables may have data entered at the beginning of the program
which stays the same throughout the running of the program (e.g. tax 
rate). You may wonder why a variable is used in this case, but consider 
the implications if a program performed calculations such as: 
tax_amount:= price*17.5/100. The program code would have to be 
changed if the tax rate changed from 17.5%. Data items which have the 
same value throughout the program are termed constants. As with 
variables, the computer needs a name for each constant and to know its 
data type so that it can be manipulated as required during the program. 
Constants and variables obey the same rules and can be compared or 
used in arithmetic expressions, provided they are of the same data type or 
of compatible types for the specific operation, e.g. integer and floating 
point. 
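

The benefit of a named constant can be sketched in Python, again as an assumption about language. Python has no enforced constants; writing the name in capitals (TAX_RATE here, a hypothetical name) is only a convention that signals the value should not change. The function name tax_amount is borrowed from the pseudocode statement above.

```python
# Named constant: if the tax rate changes, only this one line is edited.
TAX_RATE = 17.5  # per cent, the figure used in the text


def tax_amount(price: float) -> float:
    """Return the tax due on a price, using the named constant."""
    return price * TAX_RATE / 100


print(tax_amount(200.0))  # 35.0
```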


If 17.5 is written directly in a statement, as in the example above, we
call it a literal, not a constant. If a literal is other than a number, e.g.
the heading of a report could contain “student name” and “grade”, then
just as quotation marks are used to distinguish the text as being different
from the words in the sentence, the text in the pseudocode should also be
enclosed in quotation marks. For example: DISPLAY “Enter the student’s name”.


Definition: constant 


1. A quantity or data item whose value does not change. 


2. A value that is determined by its denotation, i.e. a literal. 


Definition: literal 


A word or symbol in a program that stands for itself rather than as a 
name for something else, i.e. an object whose value is determined by 
its denotation. Numbers are literals; if other symbols are used as 
literals, it is necessary to use some form of quoting mechanism to 
distinguish them from variables. 


4 More Complex Calculations: the Order of 
Precedence 


In the pseudocode statement: 
sum := number1 + number2 + number3 


a calculation is performed and the result assigned or copied into the 
location identified by the variable sum. Note that this means that the 
original data stored in the variable sum would be overwritten. 
Computer languages provide a range of operators for arithmetic. The
signs that will be used in our pseudocode are:

+ meaning add
- meaning subtract
* meaning multiply
/ meaning divide


For the moment, this is all we will use in our examples. 


The order in which a calculation is evaluated is very important, as the end 
result can differ according to which operation is given precedence, i.e. 
performed before another. The example below illustrates the potential 
problem: 


The expression a:= 4 + 3*2 could be worked out in two different orders. 


1) Add the 4 to the 3 and then multiply the result by 2:   4+3=7   7*2=14
2) Multiply the 3 by the 2 and then add 4 to the result:   3*2=6   6+4=10


One method of working has resulted in an answer of 14, the other an 
answer of 10. You may have encountered this problem during your early 
mathematics lessons. 


Having rules for the order of precedence of the arithmetic operations
avoids this ambiguity. The way a computer works out expressions is
very similar to the rules in mathematics, which are:


• do operations in brackets first;
• then evaluate any exponentiation (e.g. finding the cube of a number,
  or working out 6 to the power of 4);
• then any division or multiplication operation;
• and lastly, addition and subtraction.


Another useful way to remember the order of precedence is the word 
‘BODMAS’. This stands for: 


B - Brackets
O - Of
D - Division
M - Multiplication
A - Addition
S - Subtraction


Thus the correct answer for the expression a:= 4 + 3*2 is 10. If you
wanted the addition to be performed first, you would have to write the
expression as a:= (4 + 3)*2 and then the correct answer would be 14.
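

The two results can be checked in any language that follows these precedence rules. The short sketch below uses Python (an assumption, as the workbook is language-neutral); the variable names are chosen only for illustration.

```python
without_brackets = 4 + 3 * 2    # multiplication first: 3*2 = 6, then 4+6
with_brackets = (4 + 3) * 2     # brackets first: 4+3 = 7, then 7*2

print(without_brackets)  # 10
print(with_brackets)     # 14
```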
Exercise 2.2 [15 minutes]


Evaluate the following expressions to find the value assigned to a:
a) a:=6*3 + 2*(4+3)
b) a:=3+5*(5-1)
c) a:=3+2*(2-3)


Study Note 


When you are testing a program and having to work through the logic 
to try and find errors, you will need to be able to substitute data for 
variables in expressions to work out the current value of a variable. 
This is similar to algebra in Mathematics, but the data used could be 
textual, not just numbers. 


Consider a := b² - 4*c + d where b=2, c=3 and d=4
b² = 2² = 4      4*c = 4*3 = 12


Now only add and subtract are left, so a := 4 - 12 + 4. Working from left
to right, 4 - 12 = -8 and -8 + 4 = -4, so a := -4
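

A substitution of this kind can be checked mechanically. The sketch below uses Python as an assumed language, in which ** is the exponentiation operator. Addition and subtraction have equal precedence and are worked from left to right.

```python
b, c, d = 2, 3, 4

square = b ** 2          # exponentiation first: 2 squared is 4
product = 4 * c          # then multiplication: 4*3 is 12
a = square - product + d # then left to right: 4 - 12 = -8, -8 + 4 = -4

print(a)                           # -4
print(a == b ** 2 - 4 * c + d)     # True: same as the one-line expression
```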


Exercise 2.3 [15 minutes] 


Evaluate the following expressions where b = 2, c = 3, d = 4


a) a := b² + d(c-b)


b) a :=c(d-b) 


c) a := b*c + c(d/b+5) 


Study Note 


It is important to understand that the code in the example performed 
the calculation first and then assigned the value to the variable named 
a. An assignment is NOT the same as the logical operator equals =, 
which we use in mathematics. To help you remember this difference 
our pseudocode uses the sign :=. 


An assignment such as total := total + score means that the number 
currently stored in the variable named total is added to the number 
currently stored in the variable named score. The result is then 
assigned to the variable total, thus overwriting the original value of 
total. 
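

The difference between assignment and the equality test can be sketched in Python (an assumed language), where = assigns and == compares. The starting values 10 and 5 are arbitrary illustrations.

```python
total = 10
score = 5

# The right-hand side is evaluated first (10 + 5), then the result
# overwrites the old value stored in total.
total = total + score

print(total)         # 15
print(total == 15)   # True: == tests for equality, it does not assign
```

Read as a mathematical equation, total = total + score could only hold if score were zero; read as an assignment it is an ordinary instruction.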
Exercise 2.4 [25 minutes]


Write the pseudocode to solve the following problem: 


Input two decimal numbers and calculate and print the sum and 
product of the two numbers. Provide text prompts for the input data and 
text descriptions of the output provided. Remember to identify variable 
names and type. 


5 Control Structure 


There are three main types of instruction used in programming:
sequence, selection and loops.


Definition: control structure 


A syntactic form in a language to express flow of control. Common 
control structures are: 


IF ... THEN... ELSE, WHILE ...DO, REPEAT ... UNTIL, CASE 


5.1 Control Structure — Sequence 


The problems encountered so far have all included instructions which 
have to be carried out in sequence. This is the easiest type of program 
control but it is very important that the instructions are in the right order as 
the computer will obey the instructions implicitly in the order in which they 
are written. 


5.2 Control Structure — Selection 


There are many occasions where a program is required to include a range 
of alternative actions. 


Imagine that you have to write a program in which one number (a 
constant) is divided by another (a variable). The contents of the variable 
can change throughout the program and it is possible that the variable 
contains a zero, particularly if the data was entered via the keyboard and 
the operator had made a mistake. As it is not possible to divide by zero, 
the calculation should not be performed on this data. Therefore this 
condition has to be tested for. You need to instruct the computer, “if a 
zero is entered then print ‘invalid entry’, and do not do the division 
calculation”. 


This can be illustrated in the form of a flowchart. Flowcharts were used 
extensively by programmers before the advent of pseudocode. However, 
they are still a useful tool to help programmers work out the logic of the
sequence of instructions needed, particularly where conditions occur. 
Note that the results of the test are placed on the lines exiting from the 
decision box. It is best to write the condition in a form where the two 
choices are yes and no (i.e. TRUE or FALSE), as this matches the IF ... 
THEN ... ELSE statement. 


input the number
is the number zero?
   Yes: print “invalid entry”
   No:  calculate 100/number
        print the answer


Figure 2.2 Checking for dividing by zero 
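

The test for zero can be sketched as follows. Python is an assumption, and the function name divide_100_by is hypothetical; the constant 100 is taken from the figure. The function returns the text to be printed so that both branches can be checked easily.

```python
def divide_100_by(number: float) -> str:
    """Return the message to print for a given input number."""
    if number == 0:
        # Division by zero is not possible, so skip the calculation.
        return "invalid entry"
    answer = 100 / number
    return f"The answer is {answer}"


print(divide_100_by(4))   # The answer is 25.0
print(divide_100_by(0))   # invalid entry
```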


Another example of selection is when we need to take action based on 
user choice, as in a menu of options where the user may input alternative 
choices and, in consequence, the program must act on these choices. 


MAIN MENU 
INPUT NEW ORDERS 
AMEND ORDERS 


UPDATE ORDERS 
DISPLAY STOCK 


Figure 2.3 An example of a menu showing user choices 


All computer languages, therefore, provide some means of selection.
Usually, this is in the form of an IF statement and pseudocode is no
exception to this.
We shall use the IF statement together with logical operators (such as
equal, less than) to test for true or false, as shown below:


IF a = b THEN PRINT “a is equal to b”


where the ‘=’ sign is the logical operator.


The action to be taken is preceded by the word THEN, and is only taken if 
the test result is true. 


The logical operators used in our pseudocode are: 


= is equal to
> is greater than
< is less than
>= is greater than or equal to
<= is less than or equal to
<> is not equal to.


Hence, IF a <= b translates to ‘if a is less than or equal to b’. Let us now 
illustrate this with a practical example. 


In this example, the problem is to offer the user a menu of choices which 
will allow the input of two numbers and the calculation of the sum, 
difference or product of the numbers. 


Figure 2.4 shows the prompts provided for the user and the data that the 
user keys in. 


Choose one of the following: 
m for multiply 
a for add 
s for subtract 


a 


Input the numbers you want to use 


34,45 


Figure 2.4 Screen display when the program is running 
The pseudocode solution is:


Use variables: choice OF TYPE Character
               answer, number1, number2 OF TYPE Integer


DISPLAY “Choose one of the following:”
DISPLAY “m for multiply”
DISPLAY “a for add”
DISPLAY “s for subtract”
ACCEPT choice
DISPLAY “Input the numbers you want to use”
ACCEPT number1, number2


IF choice = “m”
   THEN answer := number1 * number2
ENDIF
IF choice = “a”
   THEN answer := number1 + number2
ENDIF
IF choice = “s”
   THEN answer := number1 - number2
ENDIF
DISPLAY answer


end program


Note that the THEN statements are indented. This indicates that they
are part of the IF ... THEN construct.
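

For comparison, the same series of separate IF statements might look like this in Python (an assumed language). The function name calculate is hypothetical, and returning None for an unrecognised choice is an assumption, since the pseudocode does not say what happens in that case.

```python
def calculate(choice: str, number1: int, number2: int):
    """Apply the operation selected by choice to the two numbers."""
    answer = None  # assumption: no answer if the choice is not recognised
    if choice == "m":
        answer = number1 * number2
    if choice == "a":
        answer = number1 + number2
    if choice == "s":
        answer = number1 - number2
    return answer


# The values keyed in by the user in Figure 2.4
print(calculate("a", 34, 45))  # 79
```

In Python the indentation is part of the syntax: the indented line under each if is the statement governed by that test.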


Study Note 
Within pseudocode, sequencing is indicated by indentation. 


Exercise 2.5 [10 minutes] 


Amend the solution to also include divide as a choice. Identify the 
changes you will need to make to the pseudocode. Remember to ask 
yourself whether the type of variables will stay the same. 


Conditional statements can be quite complex. Some examples in English 
are: 


“If I've an assignment to finish off I'll have to do it, otherwise I'll go and
see the football match with you next week.” 


“When the alarm goes off get straight out of bed, unless it’s a weekend in 
which case you can stay in bed a bit longer.” 


When we talk to each other we can use quite complex conditional 
sentences, but we are still able to understand the meaning. Computers 
are not quite so intelligent and we need to be very precise in the way that 
we provide instructions. You already know that in an IF statement, the 
instructions following the THEN part of the IF statement are executed if 
the condition is TRUE. An IF statement on its own, however, is often not
the neatest way of solving a problem. A more elegant and comprehensive
set of conditions can be created by adding an ELSE statement to the IF
statement.


The IF ... THEN ... ELSE statement is used to deal with a situation such 
as “A person is paid at the top level for category 1 work, otherwise his pay 
is at normal rates.” In this example, there is some processing to do when 
the expression is FALSE. This leads to the following logical statement: IF 
the work is category 1, THEN the pay rate is at the top level ELSE the pay 
rate is normal. Therefore, in pseudocode this is: 


IF work = cat1 
THEN p_rate := “top” 
ELSE p_rate := “normal” 
ENDIF 


Note that “top” and “normal” are literals. 
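

A minimal Python sketch of the same two-way choice is given below. Python is an assumption, the function name pay_rate is hypothetical, and the category is assumed to be held as the text “cat1”, which the pseudocode leaves open.

```python
def pay_rate(work: str) -> str:
    """Return the pay rate for a category of work."""
    if work == "cat1":
        p_rate = "top"       # executed when the condition is TRUE
    else:
        p_rate = "normal"    # executed when the condition is FALSE
    return p_rate


print(pay_rate("cat1"))  # top
print(pay_rate("cat2"))  # normal
```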


Exercise 2.6 [15 minutes] 


Amend your pseudocode solution to exercise 2.5 to use the IF ...
THEN ... ELSE statement rather than a series of IF ... THEN
statements. As a reminder, the user is provided with the menu in Figure
2.5:


Choose one of the following: 


m for multiply 
a for add 

s for subtract 
d for divide 


Input the numbers you want to use 


Figure 2.5 Menu for Exercise 2.6 


Exercise 2.7 [20 minutes] 


Write the pseudocode for a program to calculate the wages of a 
salesman according to the following rules. 


The wage is calculated at a rate of 15% of sales. If the salesman has
been with the company more than three years, he receives a loyalty
bonus of 10% of his calculated wage.


HINTS: What output is required? What processing is needed? 
What input data is required? 
Exercise 2.8 [10 minutes]


Write the pseudocode for a program to input a category of insurance 
and display details of the type of insurance available. Use the 
IF... THEN...ELSE statements. 

Category   Type of Insurance
U          Insurance not available
A          Insurance is doubled
B          Insurance is normal
M          Insurance is medically dependent


Any other category entered is invalid. 


Definition: IF THEN ELSE statement 


The most basic conditional construct in a programming language, 
allowing a choice of two alternatives, depending on the truth or falsity 
of a given condition. 


Most languages also provide an IF ... THEN construct to allow 
conditional execution of a single statement or group of statements. 


Primitive languages, such as BASIC in its original form, restrict the 
facility to a conditional transfer of control, e.g. “IF A = 0 THEN 330”, 
which is reminiscent of the conditional jump provided in the order code 
of every Central Processing Unit. See also conditional. 


The CASE Statement 


Repeating the IF ... THEN ... ELSE statements a number of times can be 
somewhat confusing. A simpler (and easier to follow) construct is to use 
a selection method that specifically covers the set of alternative conditions 
that are needed. In our pseudocode this will be called a CASE statement. 


The CASE statement is frequently used for coding the choice between 
items in lists, such as those found in screen menus. The solution to 
exercise 2.8, using the CASE statement rather than IF ... THEN ... ELSE 
statements, is shown below. Note that much less code is required with 
the CASE statement to solve the same problem. 
Definition: CASE statement


A conditional control structure that appears in most modern 
programming languages and allows a selection to be made between 
several sets of program statements; the choice is dependent on the 
value of some expression. The case statement is a more general 
structure than the IF THEN ELSE statement, which allows a choice 
between only two sets of statements. 


Solution to Exercise 2.8 using a CASE statement: 


Use variables: category OF TYPE Character
               Insurance OF TYPE String


ACCEPT category
DO CASE of category
   CASE category = “U”
      PRINT Insurance := “not available”
   CASE category = “A”
      PRINT Insurance := “double”
   CASE category = “B”
      PRINT Insurance := “normal”
   CASE category = “M”
      PRINT Insurance := “medically dependent”
   OTHERWISE PRINT “entry is invalid”
ENDCASE
end program


Study Note 


Note that the last alternative statement in the CASE statement is 
OTHERWISE. This statement will be executed if no other case has 
been selected as TRUE. 


Some programming languages, such as C, use fall-through CASE 
structures and serious errors can arise if the OTHERWISE case is not 
included. 


You are advised to always include the OTHERWISE case in every 
CASE statement. 
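

One way to sketch this multi-way selection in Python (an assumed language) is a dictionary lookup, in which the default value plays the part of OTHERWISE. The names INSURANCE_BY_CATEGORY and insurance_for are hypothetical. Python 3.10 and later also provide a match statement, which is closer in form to CASE; the dictionary is used here only because it runs on older versions too.

```python
# Each key is a CASE value; each value is the text to be printed.
INSURANCE_BY_CATEGORY = {
    "U": "not available",
    "A": "double",
    "B": "normal",
    "M": "medically dependent",
}


def insurance_for(category: str) -> str:
    """Return the insurance text, or the OTHERWISE text if no case matches."""
    return INSURANCE_BY_CATEGORY.get(category, "entry is invalid")


print(insurance_for("A"))  # double
print(insurance_for("Z"))  # entry is invalid
```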


Logical Operations 


The conditions set in the IF ... THEN statements have been concerned 
with operands being equal, but there are many occasions when the 
conditions that are to be tested will need to be extended. 


Often these conditions are linked in a logical way, and as such are called 
logical operations. In English, we might say things like: 


“If I had the time and the money I would go on holiday.”


The and means that both conditions must be true before we take an
action. We might also say “I am happy to go to the theatre or the
cinema”. The logical link this time is or. Conditions in IF statements are 
linked in the same way. 


Definition: logic operation 


An operation on logical values, producing a Boolean result (see also 
Boolean algebra). 


The operations are denoted by symbols known as operators. In 
general there are 16 logic operations over one or two operands; they 
include AND, OR, NOT, NAND, NOR, exclusive-OR, and equivalence. 


Definition: logical type 


(Boolean type) A data type comprising the logical values TRUE and 
FALSE, with legal operations restricted to logic operations. 


Definition: logical value 


(Boolean value) Either of the two values TRUE and FALSE that 
indicate a truth value. Although a single bit is the most obvious 
computer storage structure that can be applied to logical data, larger 
units of store, such as byte, are frequently used in practice since they 
can be addressed distinctly. 


Conditions linked with an AND only result in an action when all conditions
are true.


For example: IF a>b AND a>c THEN display “a is the largest”. 


Conditions linked with an OR lead to an action when either condition is
true. Let us look at a practical example to illustrate this. 


The problem is to input an examination mark and test it for the award of a 
grade. The mark is a whole number between 1 and 100. Grades are 
awarded according to the following criteria: 

>= 80              distinction
>= 60 and < 80     merit
>= 40 and < 60     pass
< 40               fail
Pseudocode solution:


Use variables: mark OF TYPE Integer
ACCEPT mark
IF mark >= 80
   THEN DISPLAY “distinction”
ENDIF
IF mark >= 60 AND mark < 80
   THEN DISPLAY “merit”
ENDIF
IF mark >= 40 AND mark < 60
   THEN DISPLAY “pass”
ENDIF
IF mark < 40
   THEN DISPLAY “fail”
ENDIF
end program
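

The four separate tests can be mirrored in Python (an assumed language), where the keyword and links two conditions in the same way as AND. The function name grade is hypothetical, and the mark is assumed to be a whole number in the stated range.

```python
def grade(mark: int) -> str:
    """Return the grade for an examination mark using four separate tests."""
    result = ""
    if mark >= 80:
        result = "distinction"
    if mark >= 60 and mark < 80:   # both conditions must be true
        result = "merit"
    if mark >= 40 and mark < 60:
        result = "pass"
    if mark < 40:
        result = "fail"
    return result


print(grade(85))  # distinction
print(grade(60))  # merit
print(grade(39))  # fail
```

Because the ranges do not overlap, exactly one of the four tests succeeds for any mark, but all four are still evaluated every time.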


Exercise 2.9 [10 minutes] 


Using the pseudocode solution provided above, complete the diagram 
shown in figure 2.6 to show the solution to the problem of inputting an 
examination mark and testing it for the award of a grade of distinction, 
merit, pass or fail. 


The diagram has been completed for the grades of distinction and
merit. You will be completing the algorithm by adding the part for a
grade of pass or fail.


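[Diagram: a decision box testing “mark >= 80” with a “Yes” branch; the
rest of the partly completed diagram is not reproduced here.]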


Figure 2.6 Diagram showing part of the algorithm for the problem “Input an examination 
mark and test it for the award of a grade” 




Chapter 2 — Variables, Control Structures and Calculations Programming Methods 


Exercise 2.10 [10 minutes]


a. Amend the code in the previous pseudocode solution to the
problem of inputting marks and allocating grades. Include the
additional criterion that if a mark is <40 and >=30 the grade is a
referral. But this time use IF ... THEN ... ELSE statements instead
of the four IF ... AND ... THEN statements.


b. Draw a diagram to show your solution.


5.3 Control Structure — Loops


The power of a computer lies in its ability to do things time and time again, 
very quickly, without becoming tired, bored, or inaccurate. The sequence 
of instructions which is repeated is called a loop. They can also be 
referred to as iterations, but strictly speaking an iteration is more complex 
than a simple loop. 


The instructions needed to process all the payroll records on a file and 
then print pay slips are an example of a loop: 


Until no more records:
   Read the payroll details for one employee
   Process the data to calculate the pay
   Write the new pay details to a file
Print all the payslips from the file


Figure 2.7 Running the payroll to illustrate ‘looping’ 


The instructions are in the form of a loop and these instructions will be 
repeated for every record on the payroll file. The loop will terminate when 
end-of-file is reached. This means that the instruction after the loop will 
then be executed and, in this example, the pay slips would then be 
printed. 


An iteration is a repetition of a sequence of instructions where the results 
from one pass of the loop are used as input to the instructions in the next 
pass of the loop. 


An example of an iteration would be if, during the running of the payroll
loop above, a running total was kept of, for example, tax to be paid. Each
time through the loop, the tax for the current employee would be added to
the running total, which would then be used as input to the next pass of
the loop for the next employee.
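The payroll loop and its running total can be sketched in Python. This is an illustration only: the record list, the names and the flat 20% tax rate are assumptions made for the example, and an in-memory list stands in for the payroll file.

```python
# Hypothetical payroll records: (employee name, gross pay). Illustrative only.
records = [("Ann", 1000.0), ("Bob", 1500.0), ("Cy", 2000.0)]
TAX_RATE = 0.20  # assumed flat rate, not taken from the text


def run_payroll(records, tax_rate):
    """Process every record, keeping a running total of the tax."""
    pay_details = []   # stands in for the output file
    total_tax = 0.0    # running total, set to zero before the loop
    for name, gross in records:        # repeats until there are no more records
        tax = gross * tax_rate
        pay_details.append((name, gross - tax))
        total_tax = total_tax + tax    # result of this pass feeds the next pass
    return pay_details, total_tax


pay_details, total_tax = run_payroll(records, TAX_RATE)
for name, net in pay_details:          # executed only after the loop has ended
    print(name, net)
print("Total tax:", total_tax)
```

The line that updates total_tax is what makes this an iteration in the stricter sense: each pass uses the value left by the previous pass.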


© NCC Education. All rights reserved. Unauthorised duplication is prohibited. 


The instructions which make up the loop have two main elements:


• a sequence of instructions to be performed each time the loop is
executed;


• an indication of when to finish executing the loop. This part will
contain a logical test, as a certain condition will need to be
checked. This checking can be done at different stages of the loop,
e.g. at the beginning or at the end, depending on the type of
instruction used to control the iteration.


There are three constructs for loops in our pseudocode:


• REPEAT UNTIL, which tests at the end of a block of code, so the
sequence of instructions is always executed at least once.


• WHILE, which tests at the start of a block of code, so it is possible
that the instructions in the loop may never be executed.


• FOR loop, which is controlled by a count given from known
conditions.


Definition: loop 


A sequence of instructions which is repeated until a prescribed 
condition, such as agreement with a data element or completion of a 
count, is satisfied. See also do loop. 


REPEAT an action or block of actions UNTIL (a true condition occurs). 


This type of loop is often used for processing input data which will have 
a coded item at the end of the data to indicate that the end of the data 
has been reached. 


For example, percentage examination marks could be terminated by a
number outside the range of 0 to 100. Consider a program segment to
allow entry of a number in the range 0 to 100.


Pseudocode solution: 


Use variables: number OF TYPE Integer
REPEAT
    DISPLAY "Enter a number between 0 and 100"
    ACCEPT number
UNTIL number < 0 OR number > 100
end program


Figure 2.8 Algorithm for only accepting a number between 0 and 100
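Python has no REPEAT ... UNTIL statement, so a common way to get the same "test at the end" behaviour is a `while True` loop with a `break`. The sketch below assumes that: the function name `read_marks` is hypothetical, the marks are stored in a list (which the pseudocode does not do), and a sequence of values stands in for keyboard input.

```python
def read_marks(entries):
    """Collect marks until a value outside 0..100 is met.

    `entries` stands in for successive keyboard inputs and is assumed
    to contain at least one terminating value.
    """
    marks = []
    source = iter(entries)
    while True:                            # REPEAT
        number = next(source)              #     ACCEPT number
        if number < 0 or number > 100:     # UNTIL number < 0 OR number > 100
            break
        marks.append(number)
    return marks


print(read_marks([55, 72, 100, 0, -1, 40]))
```

The body runs before the test is made, so at least one value is always read, exactly as in the REPEAT ... UNTIL version.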


Another common use is in selection from a menu of options, as in the 
following example. A survey has been carried out to discover the most 
popular sport. The results will be typed into the computer for analysis. 


A pseudocode solution follows: 


Use variables: letter OF TYPE Character
               athletics, swimming, football, badminton OF TYPE Integer
REPEAT
    DISPLAY "Type in the letter chosen or Q to finish"
    DISPLAY "A: Athletics"
    DISPLAY "S: Swimming"
    DISPLAY "F: Football"
    DISPLAY "B: Badminton"
    DISPLAY "Q: end data:"
    ACCEPT letter
    IF letter = 'A'
        THEN athletics := athletics + 1
    ENDIF
    IF letter = 'S'
        THEN swimming := swimming + 1
    ENDIF
    IF letter = 'F'
        THEN football := football + 1
    ENDIF
    IF letter = 'B'
        THEN badminton := badminton + 1
    ENDIF
UNTIL letter = 'Q'
DISPLAY "Athletics scored ", athletics, " votes"
DISPLAY "Swimming scored ", swimming, " votes"
DISPLAY "Football scored ", football, " votes"
DISPLAY "Badminton scored ", badminton, " votes"
end of program


Study Note


Note the statement swimming := swimming + 1.


To remind you, this means that the current data stored in the variable 
‘swimming’ is added to 1 and the answer is placed back in the variable 
‘swimming’. 


It is assumed that the variable ‘swimming’ will contain zero at the 
beginning of the program. 


When this type of statement is in a loop, and the loop can be executed 
several times, it is important that the variable contains a zero at the 
beginning of each count. 


It is good practice to make all running totals zero before the beginning 
of the block of statements containing the statement to add to the 
running total. 
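A minimal Python sketch of the survey count shows the totals being set to zero before the loop begins. The function name `tally_votes` and the use of a dictionary are assumptions made for this example, and a list of letters stands in for keyboard input.

```python
def tally_votes(letters):
    """Count the votes for each sport, stopping at 'Q'.

    Letters that are not on the menu are ignored.
    """
    # Every running total is set to zero before the loop starts.
    counts = {"A": 0, "S": 0, "F": 0, "B": 0}
    for letter in letters:
        if letter == "Q":          # end-of-data marker
            break
        if letter in counts:
            counts[letter] = counts[letter] + 1
    return counts


votes = tally_votes(["A", "S", "S", "B", "X", "Q", "F"])
print("Swimming scored", votes["S"], "votes")
```

If the dictionary were created once outside the function and reused for a second survey, the second set of counts would start from the first set's totals, which is the problem the Study Note warns about.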


Exercise 2.11 [5 minutes] 


What difference would it make to the way the program worked if the 
REPEAT instruction was moved to immediately before the instruction 
ACCEPT letter? 


Exercise 2.12 [25 minutes] 


A number of student results are to be entered using the keyboard and 
the results are to be printed. Each student will have a variable number 
of results, the termination of which is represented by the value 999. 
Although the number of marks for each student is variable, these will 
usually be between 1 and 5, so there will be no problem printing the 
marks for each student on one line. The termination of the whole group 
of student names will be represented by XXXX. 


Write the pseudocode to calculate the total score for each student, print 
the student name, each of the student’s marks and the total. An 
example of output might be: 


Student Name      Marks       Total
John Smith        34 56 72    162


where the data typed using the keyboard for two students would be:


John Smith, 34, 56, 72, 999, William Holland, 67, 43, 12, 34, 999,
XXXX.


WHILE a True Condition Occurs DO an Action or Block of Actions


In the second type of iteration, we test for the terminating condition at the 
beginning of the code block. 


WHILE (a true condition)
    STATEMENT
    or STATEMENT BLOCK, concluded with
ENDWHILE


A statement BLOCK means a group of statements needed to complete a 
particular process required. The statements are indented to indicate that 
they are ‘together’. 


For example: A program segment to display each character typed using a 
keyboard until the character q is entered. 


Use variables: letter OF TYPE Character
ACCEPT letter
WHILE letter <> 'q'
    DISPLAY "the character you typed is ", letter
    ACCEPT letter
ENDWHILE


Study Note 


In this example, the statement block contains two statements which are 
always executed together, a DISPLAY statement and an ACCEPT 
statement. 
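For comparison, the same segment might be written in Python as follows. This is a sketch under assumptions: the function name `echo_until_q` is hypothetical, a string stands in for the keys typed (and is assumed to contain a q), and the displayed lines are collected in a list so that they can be inspected.

```python
def echo_until_q(typed):
    """Return the lines displayed for the characters in `typed`."""
    shown = []
    source = iter(typed)
    letter = next(source)                  # ACCEPT letter
    while letter != "q":                   # WHILE letter <> 'q'
        shown.append("the character you typed is " + letter)
        letter = next(source)              #     ACCEPT letter
    return shown                           # ENDWHILE


for line in echo_until_q("abq"):
    print(line)
```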


Exercise 2.13 [10 minutes] 


a. Draw the diagram to show this algorithm. 


b. Why are there two ‘ACCEPT letter’ instructions in the solution?


Definition: DO-WHILE loop

A form of programming loop in which the condition for termination
(continuation) is computed each time around the loop. There are
several variants on this basic idea. For example, Pascal has:

WHILE <condition> DO
BEGIN
    <statements>
END

and also

REPEAT
    <statement>
UNTIL <condition>


The first is a WHILE loop and the second is a REPEAT UNTIL loop. 


Apart from the obvious difference that the first specifies a continuation 
condition while the second specifies a termination condition, there is a 
more significant difference. 


The WHILE loop is a zero-trip loop, i.e. the body will not be executed at 
all if the condition is false the first time around. In contrast, the body of 
a REPEAT-UNTIL loop must be obeyed at least once. 
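The difference can be seen by counting passes. In the Python sketch below (an illustration with hypothetical function names), both loops count down from n, but one tests at the start and the other at the end. Python has no REPEAT ... UNTIL, so `while True` with a `break` is used to imitate it.

```python
def while_passes(n):
    """Count the passes of a WHILE-style loop that continues while n > 0."""
    passes = 0
    while n > 0:            # test at the start: may never enter the body
        passes += 1
        n -= 1
    return passes


def repeat_until_passes(n):
    """Count the passes of a REPEAT-UNTIL-style loop that stops when n <= 0."""
    passes = 0
    while True:
        passes += 1         # the body runs before any test is made
        n -= 1
        if n <= 0:          # test at the end
            break
    return passes


print(while_passes(0), repeat_until_passes(0))
print(while_passes(3), repeat_until_passes(3))
```

Starting from 0, the WHILE version makes no passes and the REPEAT-UNTIL version makes one; starting from 3, both make three.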


Similar constructs are found in most languages, though there are many 
syntactic variations. 


See also DO loop.


Exercise 2.14 [15 minutes]


Consider the flowchart, shown in Figure 2.9, of a part of a program to
accept a letter via the keyboard, and then answer the following
questions:


a) When this part of the program terminates, what data is stored in the
variable letter?


b) How many times will the loop be executed when the user tries to
enter the data pseudoqode?


c) Describe the effect of entering the data Uu DdOOQqUuOO.


d) Complete the following table to indicate the contents of the variable
letter and the character displayed for each pass of the loop, when
the letters p s e u d o q are typed.


               Displayed character    Contents of letter
before loop                           p
first pass     p                      s
second pass
third pass
fourth pass


Figure 2.9 Flowchart for algorithm of typing letters via the keyboard


REPEAT an Action or Block of Actions FOR a Number of Times

The third type of loop, which we shall use when the number of repetitions 
is known in advance, is a FOR loop. This, in its simplest form, uses two 
values of a variable, the starting and the final condition for the action. The 
variable is incremented on each iteration until it reaches the value 
identified as the final state. 


The primary difference between the Do/While statement and the
Repeat/Until statement is the treatment of the loop exit condition. The
Do/While statement exits on false; the Repeat/Until statement exits on
true.


The pseudocode syntax will be: 


FOR (starting state, final state, increment)
    Statement
    Statement
ENDFOR


The loop is controlled by a variable which is a simple count and is 
assigned a starting value, e.g. n := 1. Each pass through the loop involves 
a series of instructions such as: 


• compare the current value of count to the final state. If the count
has passed the final state (i.e. count > final state) then the loop will
terminate, otherwise continue;


• add the increment to the count.


Using these terms, the pseudocode definition for a FOR ... ENDFOR loop
will be:


FOR (n := a, n = b, +1)
    command sequence
ENDFOR
-- where a is the starting point,
-- +1 is the increment and
-- b is the final value. This means that the loop will
-- continue to be executed while n is less than or equal to b.


Note: the lines beginning with -- are comments. They are for the
programmer's benefit and are not treated as program instructions. Every
programming language will provide some way of denoting comments. Note
the -- at the beginning of each comment line in this pseudocode.
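As a rough Python equivalent (a sketch, not part of the course pseudocode), the built-in `range` can produce the same sequence of values. Its stop value is exclusive, so b + 1 is needed to include the final value b. The helper name `for_loop_values` is hypothetical.

```python
def for_loop_values(a, b, step=1):
    """Values taken by n in FOR (n := a, n = b, +step), with b included."""
    # range() stops before its second argument, hence b + 1.
    return list(range(a, b + 1, step))


print(for_loop_values(1, 5))    # n takes the values 1, 2, 3, 4, 5
print(for_loop_values(3, 2))    # start already past the final value: no passes

for n in range(1, 5 + 1):       # FOR (n := 1, n = 5, +1)
    print("pass", n)            #     command sequence
```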


For example: a FOR loop to control an iteration to read a pair of numbers 
and find the average of each pair where a total of 50 pairs of numbers are 
to be input would need: 

• a variable to control the loop, e.g. count;

• a starting state of count := 1;

• a final value for count, in this case count = 50;

• an increment of 1 (that is, 1 would be added to count on each pass of
the loop, count := count + 1).


Example: 


FOR (n := 1, n = 3, +1)
    DISPLAY "loop", n
ENDFOR


This fragment of code will produce the output:

loop 1    when n=1
loop 2    when n=2
loop 3    when n=3


Study Note 


Notice how the variable used to control the loop has also been used in
the DISPLAY statement. However, if the contents of the variable had
been changed by an instruction within the loop, this would have
interfered with the behaviour of the loop. Do not interfere with the
contents of variables which are controlling loops and be aware of any
other use for the counting variable, apart from controlling the FOR loop.


For example: write a program to calculate the sum and average of a 
series of numbers. The user will decide how many numbers are entered. 


The algorithm (solution) for this program is shown as pseudocode and
also in the form of a flowchart (figure 2.10).


The pseudocode is:


Use variables: n, count OF TYPE Integer
               sum, average, number OF TYPE Real
DISPLAY "How many numbers do you want to input?"
ACCEPT count
FOR (n := 1, n = count, +1)
    ACCEPT number
    sum := sum + number
ENDFOR
average := sum / count
DISPLAY "The sum of the numbers was ", sum
DISPLAY "The average of the numbers was ", average
end of program
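A Python sketch of the same algorithm is given below for comparison. It assumes the numbers are already held in a list rather than typed at the keyboard, and the function name `sum_and_average` is hypothetical. The running total is set to zero explicitly before the loop, and the division assumes that at least one number was supplied.

```python
def sum_and_average(numbers):
    """Return the sum and the average of a non-empty list of numbers."""
    count = len(numbers)
    total = 0.0                        # running total set to zero first
    for n in range(1, count + 1):      # FOR (n := 1, n = count, +1)
        total = total + numbers[n - 1]
    average = total / count            # assumes count > 0
    return total, average


total, average = sum_and_average([2.0, 4.0, 6.0])
print("The sum of the numbers was", total)
print("The average of the numbers was", average)
```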


Figure 2.10 Flowchart showing algorithm for program to calculate the sum and
average of a series of numbers using the FOR ... ENDFOR statement


Definition: DO loop


A counting loop in a program, in which a section of code is obeyed
repeatedly with a counter taking successive values.


Thus in FORTRAN:


DO 10 I = 1,100
    <statements>
10 CONTINUE


causes the <statements> to be obeyed 100 times. The current value of
the counter variable is often used within the loop, especially to index an
array.


There are many syntactic variants: in Pascal and Algol-related
languages the same basic construct appears as the FOR loop, e.g.

FOR I := 1 TO 100 DO
BEGIN
    <statements>
END


This kind of loop is a constituent of almost all procedural programming 
languages (except APL, which has array operations defined as 
operators in the language). 


See also DO-WHILE loop. 


Exercise 2.15 [10 minutes] 


Amend the previous pseudocode solution to input a student name and 
marks as in exercise 2.12, but this time there are always three 
examination marks for each student. You are required to provide the 
average examination mark for each student. Forty-five students took
the exam. The examination marks input are whole numbers, i.e.
integers. Use a FOR ... ENDFOR construct.


Exercise 2.16 [5 minutes]


What type of loop construct would you use to do the following?

a. Perform a series of calculations 10 times.


b. Input a series of numbers terminated by 999999.


c. Calculate the answer to the formula y := x² + 2*x, for x in the range
100 to 199.


d. Only accept numbers via the keyboard which are positive,
terminating the procedure when a number is encountered outside
this range.


e. Calculate the answer to the formula y := x² + 2*x, for even numbers
in the range 100 to 199.


Summary 


In this chapter we have covered: 


• Variable types and names.

• Calculations and order of precedence.

• Control structure 1: sequence.

• Control structure 2: selection.

• Control structure 3: loops.


You will have become familiar with variables. You should understand how 
variables work and why they are so necessary. You will have practised 
identifying the different types of variables and be used to declaring the 
different types of variables needed for any solution to a problem. You are 
able to solve simple problems using arithmetic operations and are aware 
of the importance of obeying the rules of precedence. 


You have studied the three basic constructs of Structured Programming 
and the different control structures that are available. You should 
understand how to control loops and provide alternative actions as a 
result of testing a logical condition. You should also have learned how to 
represent the algorithm as a diagram and understand pseudocode 
programs. 


You are now ready to develop more complex programs and the next
stage is to consider methods to help create them successfully through an
examination of complex data types.


7 Self Study


This chapter is concerned with problem solving, and as such you are
advised to work through the chapter in the order provided. Providing
correct solutions to the problems in the form of a diagram and
pseudocode is useful preparation for solving more advanced problems.
You are advised to continue to practise these skills until you feel confident
that you have mastered the art of diagramming and producing
pseudocode. Providing algorithms to problems is the main task of a
programmer; writing in different programming languages is simply a
matter of translation, e.g. how do I say that in ‘C’?


7.1 Variables, Types and Names 
The following is a summary of the pseudocode introduced so far: 
1. Keywords are defined in upper case e.g. READ, ACCEPT, PRINT. 


2. Variables and values are defined in lower case e.g. sum, 
next_number. 


3. Types are defined with an upper case first letter, e.g. Integer, 
Character. Pseudocode has five basic types of data, Integer, Real, 
Character, Boolean, String. 

4. The assignment symbol := means ‘takes the value of’. 


5. Keywords allow data to be input to and output from the program. 


• READ to input data from a backing store, e.g. READ
next_number.


• WRITE to output data to a backing store, e.g. WRITE
stock_record.


• ACCEPT to input data from a keyboard, e.g. ACCEPT
user_response.


• DISPLAY to output data to a screen, e.g. DISPLAY error_report.


• PRINT to output data to a printer, e.g. PRINT “The answer is ”,
sum.


7.2 Calculations and Order of Precedence


Self Study 1 [30 minutes] 


Write the pseudocode for a program which will calculate the total 
surface area of the walls and the surface area of the ceiling, when the 
dimensions of a room are entered. Doors and windows can be ignored. 
Provide text prompts for input and text descriptions for output. 


Self Study 2 [30 minutes] 


Write the pseudocode for a program which will calculate the amount of 
tax and total cost including tax, when the price, quantity and description 
are entered for an item bought at a wholesale builders merchants. All 
prices quoted at a wholesalers would be excluding tax. Assume that a 
tax rate of 17.5% will apply. Provide text prompts for input and text 
descriptions for output. 


Self Study 3 [30 minutes] 


Write the pseudocode for a program which will calculate the gross 
amount of pay earned when the hours worked and wage per hour are 
entered for an employee. Employees’ hours are always rounded up to 
the nearest half hour. Income tax of 20% is to be deducted from the 
gross pay to calculate the net pay. Provide text prompts for input and 
text descriptions for output. 


7.3 Control Structure: Sequence 




Self Study 4 [15 minutes] 


The following instructions are needed to complete the task “making a 
cup of tea”. BUT the sequence is wrong. 


Making a cup of tea

1 fill kettle
2 get teapot
3 put tea in teapot
4 pour water in teapot
5 add milk to cup
6 boil water
7 pour tea from teapot into cup


a. What would be the result if the above instructions were obeyed 
exactly? 


b. What should the order be? 




Self Study 5 


Identify the sequence of instructions to connect up a PC. 


Self Study 6 [15 minutes] 


Identify the sequence of instructions to get up in the morning, from 
waking up to leaving home. 


7.4 Control Structure: Selection 


Pseudocode definition: IF ... THEN ... ELSE 
Conditional branching using the IF ... THEN ... ELSE construct 


IF condition
    THEN
        command sequence 1
    ELSE
        command sequence 2
ENDIF
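As an illustration only, a two-way selection and a chained ELSE IF selection might look like this in Python. The function names are hypothetical; the grade boundaries of 80, 60 and 40 are those used in the chapter's examination mark example.

```python
def pass_or_fail(mark):
    """Two-way selection: IF ... THEN ... ELSE ... ENDIF."""
    if mark >= 40:
        result = "pass"
    else:
        result = "fail"
    return result


def grade(mark):
    """Multiple selection: each ELSE IF becomes an elif."""
    if mark >= 80:
        return "distinction"
    elif mark >= 60:
        return "merit"
    elif mark >= 40:
        return "pass"
    else:
        return "fail"


print(pass_or_fail(39), grade(85), grade(60), grade(12))
```

Python has no ENDIF; the indentation alone marks where each branch ends, which is why consistent layout matters even more than it does in the pseudocode.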


Multiple branching (or selection) with nested IFs 


IF condition 1
    THEN
        command sequence 1
    ELSE IF condition 2
        THEN
            command sequence 2
        ELSE IF condition 3
            THEN
                command sequence 3
ENDIF


Note: The pairs of THEN and ELSE appear directly under each other.
The command sequence can be a number of statements. In this case
indent the first one on the line below the THEN or ELSE as shown
here. Where only one statement is needed after a THEN or ELSE, it is
often written on the same line but the THEN and ELSE are lined up.
The IF ... THEN ... ELSE construct should always have an ENDIF to
indicate the end of it. Some programming languages require an
ENDIF for every pair of THEN and ELSE. The indentations indicate the
sequence of the instructions.


Pseudocode definition: CASE structure
The CASE structure is shown in pseudocode as:


DO CASE of index
    CASE index condition 1
        command sequence 1
    CASE index condition 2
        command sequence 2
    ...
    CASE index condition N
        command sequence N
    OTHERWISE
        default command sequence
ENDCASE
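Older versions of Python have no CASE statement, so one common substitute is a dictionary lookup with a default value playing the part of OTHERWISE. The sketch below assumes that approach. The menu options and the function name `describe_choice` are invented for the example.

```python
def describe_choice(index):
    """DO CASE of index ... OTHERWISE ... ENDCASE, using a dictionary."""
    cases = {
        1: "open file",      # CASE index = 1
        2: "save file",      # CASE index = 2
        3: "print file",     # CASE index = 3
    }
    # The second argument of get() supplies the OTHERWISE branch.
    return cases.get(index, "unknown option")


print(describe_choice(2))
print(describe_choice(9))
```

A dictionary suits cases that select a value. Where each case needs a sequence of commands, a chain of if ... elif ... else tests on the index is the more direct translation.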


Self Study 7 [15 minutes] 


Amend the code to the pseudocode solution to Exercise 2.6 (on page 
2-17) by using the CASE statement rather than IF...THEN... ELSE 
statements. 


Self Study 8 [30 minutes] 


Amend the code to the pseudocode solution to Exercise 2.7 (on page 
2-17). The company have changed the formulae for calculating wages. 


During the first year the wages are 10% of sales. When the sales staff 
have completed one year they have a raise to 12.5% of sales, after two 
years it changes to 15% and on completing three years to 17.5%. 
Those staff who have been with the company four or more years have 
a wage of 17.5% of sales, plus a loyalty bonus of 10% of the calculated 
wage. 


Use the CASE statement rather than IF ... THEN ... ELSE statements.


Self Study 9 [30 minutes]


An index is set to the numbers 1, 2, 3 or 4 according to a given job
code, as described in the table below.


Index   Meaning
1       four-person job, expected to last three days
2       two-person job, expected to last four days
3       one-person job, expected to last five days
4       two-person job, expected to last seven days


Each person is paid £50 per day. Using the CASE statement write the 
pseudocode that will process these conditions and calculate the 
expenditure for each. 


7.5 Control Structure: Loops 


Pseudocode definition: REPEAT...UNTIL 


REPEAT
    command sequence
UNTIL condition


Pseudocode definition: WHILE ... ENDWHILE


WHILE condition
    command sequence
ENDWHILE


Pseudocode definition: FOR ... ENDFOR 


FOR (n := a, n = b, +1)
    command sequence
ENDFOR
-- where a is the starting point,
-- +1 is the increment and
-- b is the final value. This means that the loop will
-- continue to be executed while n is less than or equal to b.


Note: the lines beginning with -- are comments. They are for the
programmer's benefit and are not treated as program instructions. Every
programming language will provide some way of denoting comments. Note
the -- at the beginning of each comment line in this pseudocode.


Self Study 10 [60 minutes]


Consider the following program to print out a student’s results, 
including average and grade for a variable number of students in a 
class, when input via the keyboard. 


Use variables: n, exam1, exam2, exam3, count OF TYPE Integer
               average_student OF TYPE Real
               name OF TYPE String
               student_id_code, grade OF TYPE Character
DISPLAY "Enter the number of students"
ACCEPT count                    > variable not declared
PRINT "Student ID Code  Student name  Exam1  Exam2  Exam3  Average  Grade"
FOR (n := 1, n = count, +1)
    DISPLAY "Enter the student's ID code followed by name"
    ACCEPT student_id_code, name
    DISPLAY "Enter the three examination marks for ", name
    ACCEPT exam1, exam2, exam3
    average_student := (exam1 + exam2 + exam3) / 3
    IF average_student < 30
        THEN grade := "Fail"
        ELSE IF average_student < 40
            THEN grade := "Referral"
            ELSE IF average_student < 60
                THEN grade := "Pass"
                ELSE IF average_student < 75
                    THEN grade := "Merit"
                    ELSE grade := "Distinction"
    ENDIF
    PRINT student_id_code, name, exam1, exam2, exam3, average_student, grade
ENDFOR
end program


Two amendments are required: 


The number of students entered via the keyboard is not validated and 
is used to control the FOR ...ENDFOR loop. Make sure that only a 
number in the range of 1 to 50 can be input and if an error occurs, 
report it to the user and ask for the number again. 


The user has made the following request: 


“Could you provide me with the average for all the exams for the whole 
class please. A line written along the bottom of the report with a side 
heading of ‘Average exam mark’, would be great.” 


a) Draw the diagram to represent the algorithm for the calculation of 
the average exam mark for the whole class. 


b) Identify the changes that will be required to the pseudocode
provided.


Self Study 11 [20 minutes]


There was some confusion in dealing with the program amendments in
Self Study 10. The teacher did not make it clear that the average for
EACH exam for the class was required. So an average mark for the
class is required for each of exam1, exam2 and exam3, in addition to
the overall average for all the exams.


Identify the changes that will be required to the pseudocode solution to 
Self Study 10. 


Chapter 3 


Data Analysis and Problems 


1 Learning Outcomes
2 Introduction
3 Program Development Process
    3.1 Introduction
    3.2 Program Specification
        Requirements Analysis
        Design
        Coding
        Testing
        Implementation and Support
    4.1 Introduction
    4.2 Data Structure Diagrams
5 Organising Information
    5.2 Designing Record Structures
    5.3 File Structures on a Computer
6 Fixed and Variable Length Records
    6.1 Fixed Length Records
    6.2 Variable Length Records
8 Designing a Top-down Modular Program
        Examples of Initialisation Routines
        Examples of Processing Tasks
        Examples of Closing Tasks
9 Determining the Structure of a Program from a Given Specification
    9.1 Program Specification
10 Structured Programming Diagrams
    10.2 Establishing the Program Structure
        Example 1: Using One Input File to Produce One Output File or Report
    11.2 Finding Classes and Considering Scenarios
    11.3 Method
13 Self Study
    13.1 Program Development Process
    13.2 Organising Information
    13.3 Structured Programming Diagrams
        Two Input Files and One Output File
        Program Specification


1 Learning Outcomes 


At the end of this chapter, you should be able to: 


• Understand the software development process.

• Create structure diagrams for structured programs.

• Understand the object-oriented environment.

• Explain the different variable types.

• Organise information.

• Analyse problems.


2 Introduction 


This chapter is concerned with program design. The software 
development process is touched on to set the scene, but is addressed in 
more detail in Chapter 8 ‘Implementation’. You will be introduced to data 
structures and their importance in program design, and will be creating 
data structure diagrams to aid comprehension of the program design 
process. 


Diagrams and tables have been widely used to help in the explanations. 
It is important to study the diagrams carefully, as much information is 
portrayed through them. They are a form of communication between 
programmers and designers; you need to be able to ‘read’ them as easily 
as a page of writing. 


3 Program Development Process 


3.1 Introduction 


There are many different types of programs which vary in complexity from 
simple ones written as a hobby, to complex programs written by 
professional programmers, which form part of a large system. Examples 
which illustrate the range of types and complexity of programs are: 


e A suite of programs for a production control system in a large 
manufacturing company. Stock control and ordering supplies, as well 
as the invoicing system, could be handled by a range of different 
programs all accessing and updating the same data files and being 
treated as one system. A team of programmers would be involved in 
the development process, all working on different programs within the 
system. Systems analysts would be responsible for the analysis and 
design of the whole system. 


• A simple database program which records the movement of stock in
a small video hire shop. One programmer could be responsible for
the development of the program and, although the program will
access and update data files, it is clearly not as complex as the
previous example.


• A program to perform statistical analysis on the results of an
experiment. The data in this case will be numerical values and the 
results or output will be in a form that is understood by the 
researchers, but not necessarily by anyone else. 


• A program to display the contents of a database via the web and
process orders and payments. This program will have to handle
input data from the database in addition to input from users, and it is
very important that the output from the program, in the form of
screen displays, is understood by all users.


• An interactive multimedia program which may be informative, e.g. an
encyclopaedia, and thus contains a large information base, or a 
game which interacts with user actions. The functionality of these 
programs is linked to navigation, allowing the user to initiate a variety 
of sequence paths through the program. They also involve accessing 
and displaying multimedia assets such as text, graphic images, audio 
and video files. Timing is a functional issue for the programmers. For 
example, a video must appear in the correct place on the screen and 
play when the user clicks on the play button. The sound and 
appearance of graphic assets must be synchronised (happen at 
exactly the same time). 


The programs identified above, although seemingly very different, all 
process input data to produce output. Thus, before writing a program, the 
programmer needs to know: 


• the output required by the user, which can be in the form of screen
displays, printouts or updated files;

• the processes necessary to produce this output for the user;

• the data, or input, to be processed to create the output required by
the user. This could be in the form of data files or input from the user
via the keyboard or mouse.
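

The three points above can be pictured as three small routines, one each for input, process and output. The Python sketch below is only an illustration: the function names, the stock figures and the reorder level of 10 are assumptions made for the example, not part of any system described in this chapter.

```python
# Hypothetical example: stock levels are the input, a reorder report is the output.

def read_input():
    """Input: the data to be processed (a hard-coded stand-in for a data file)."""
    return {"widget": 4, "bolt": 120, "gear": 9}


def process(stock, reorder_level=10):
    """Process: pick out the items whose quantity is below the reorder level."""
    return sorted(name for name, quantity in stock.items() if quantity < reorder_level)


def write_output(items):
    """Output: a report line in the form the user has asked for."""
    return "Reorder: " + ", ".join(items)


report = write_output(process(read_input()))
print(report)  # Reorder: gear, widget
```

Starting from the output (the report line) and working back to the data needed to produce it mirrors the order in which the three points are listed.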


Definition: programmer 


A person responsible for writing computer programs. See application 
programmer, systems programmer. 


Definition: systems analyst 


A person responsible for the development of an information system. 
Systems analysts design and modify systems by turning user 
requirements into a set of functional specifications, which are the 
blueprint of the system.


Definition: systems analysis


The analysis of the role of a proposed system and the identification of a
set of requirements that the system should meet, and thus the starting
point for systems design. The term is most commonly used in the context
of commercial programming, where those involved in software
development are often classed as either systems analysts or
programmers. The systems analysts are responsible for identifying a set
of requirements (i.e. systems analysis) and producing a design. The
design is then passed to the programmers, who are responsible for the
actual implementation of the system.


Program Specification 


The details of the specification will be discovered during the analysis and 
design stages of development. The program development process 
described in this section refers to the development of professional 
software. 


Definition: specification 


A formal description of a system, or a component or module of a system, 
intended as a basis for further development. 


The expression of the specification may be in the text of a spoken 
language (e.g. English), in a specification language (which may be a 
formal mathematical language), or in diagrammatic form, illustrating the 
stages of the methodology, and using a diagrammatic technique. 


Characteristics of a good specification are that it should be unambiguous, 
complete, verifiable, consistent, modifiable, traceable, and usable after 
development. 


Definition: specification language 


A language that is used in expressing a specification. It has a formally 
defined syntax and semantics, and its design is based on a mathematical 
method for modelling or defining systems. 


Definition: diagrammatic technique 


A style of analysis or design that relies primarily on the use of diagrams 
(as opposed to text or databases). The advantage is the direct appeal to 
users, the disadvantage is the limitation of two dimensions. 


The program development process is part of the software lifecycle. This 
is addressed in detail in Chapter 8 ‘Implementation’. The software 
lifecycle can be structured in more than one way, but it is characterised 
by the following phases:


• Requirements analysis.
• Design.
• Coding.
• Testing.
• Implementation and support.


Each of these phases is briefly described below, and the documentation 
produced at each phase is identified. 


Requirements Analysis 


At this stage, an accurate and complete set of client and user 
requirements is produced to determine the characteristics of an 
acceptable solution. This information is obtained mainly from direct 
interviews with the client and, if possible, current and future users of the 
system. 


• The client is the person or organisation who is paying for the software
to be developed.


• The users are the people who will be using the completed software.


The process of identifying the users of a program written for internal use 
in a particular company will necessarily be different to that of identifying 
the users of a piece of software written for the market place. 


A requirements analysis specification (a document) is produced, and this 
will contain the following information: 


• The proposed system or solution, which has been agreed by the
client and developer.


• A list of the existing tools, new tools required, facilities and people
available for developing the solution.


• A schedule for the next stages of the project, including the
deliverables for each stage.


Deliverables are the products of the different phases throughout the 
development process. They are provided for the client by the developer, 
as evidence that progress is being made according to the requirements 
analysis document. They may be written documents and diagrams 
produced during the analysis and design phases, sample screen layouts 
or sample reports. 


The objective of the requirements analysis stage is to define, in detail, a 
solution that will fully meet the client and user requirements. A systems 
analyst will be responsible for this if the program is large, or if there are 
many programs in the system. It involves translating the requirements 
identified into terms that can be understood by the system designers,
programmers, and testers. The end product of this stage will be a
program specification which will include a description of: 


• the inputs to the process;
• the operations the system performs for each input;
• the output obtained for the corresponding input.


If there is more than one program in the system, the document is called a 
system specification. 


Design 


The design stage describes how the solution will be built to satisfy the 
requirements specified at the previous stage. The final set of programs 
will be produced in accordance with this description, so it has to be a 
detailed, technical and logical definition of the final system. Where 
programmers are involved in writing programs for large systems, the 
systems analyst would provide the program specifications and the 
programmer would begin work at this stage. We will now concentrate on 
program design rather than systems design. 


Complex problems cannot be solved in one step. Therefore, they are
divided into a set of sub-problems which individually are more easily
solved. This decomposition process results in a set of programs and
modules interacting with each other. A test plan needs to be developed
for each program or module used in building the whole program, to
ensure that each meets its individual specification.


All these programs and modules are defined in terms of their inputs, 
outputs, required functions and processes. The interaction parameters 
(timing and performance requirements) between each of the modules in 
the program are defined explicitly. 


At this stage, the use of formal program design techniques and
programming standards is recommended. The designer should select
appropriate data structures and algorithms for the implementation of the
input and output data and the system functions, although the final details
may be left until the coding stage.


During the development of the design, a decision will have been made as 
to which programming language would be most appropriate. As this is the 
last stage before the coding of the new system, a decision should be 
taken as to whether the programs are going to be developed internally, 
externally or both. The program can either be developed from scratch, 
assembled from application generators (see Chapter 7), or contracted to 
an independent programmer who will produce the required code from the 
program specifications.


Coding


The object of this stage is to produce the programs that will make up the 
system. Ideally, the coding should start when the previous phase (design) 
is completed. This phase is complete when all code is written and 
documented, and compiles without any errors. The program is then ready 
to be tested. 


Testing 


Every program module needs to be tested according to the test plan 
developed in the design phase and each should match its individual 
specification. This task can be carried out at the same time as the 
program is being developed. 
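

A module test of this kind can be sketched in a few lines of Python. The module (order_total), its specification and the entries in the test plan are all hypothetical and chosen only to show the idea: each entry in the plan pairs a set of inputs with the output that the specification requires.

```python
def order_total(unit_price, quantity):
    """Module under test: the price of one order line (hypothetical specification)."""
    if quantity < 0:
        raise ValueError("quantity must not be negative")
    return unit_price * quantity


# Test plan: (inputs, expected output) pairs taken from the module's specification.
TEST_PLAN = [
    ((5, 3), 15),
    ((5, 0), 0),
    ((0, 7), 0),
]


def run_test_plan():
    """Run every case in the plan and return the ones that failed."""
    failures = []
    for args, expected in TEST_PLAN:
        actual = order_total(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


failures = run_test_plan()
print("all tests passed" if not failures else failures)
```

Because the plan is data rather than code, it can be written during the design phase and run unchanged each time the module is altered.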


When the program is completed and all separate modules have been 
tested, a full test of the program, according to the test plan, will be 
performed. Any errors in the program will be corrected and the test 
repeated. However, the testing could result in the identification of 
changes to the design of the program, in which case the analysis, design, 
coding and testing phases are repeated for the changes required. 


If the program is part of a system, a systems test will take place after the 
individual programs have been tested, where the programs are tested as 
a group to see how they interact with each other. 


Do you remember the example of many programs accessing and 
updating the same data file in the production control system? The system 
must be tested in each environment in which it is likely to be used, as the 
programs may have been developed on machines using the latest 
technology. The user, however, could be working with older machines 
which were specified in the requirements analysis phase. For example, 
websites must be tested on a range of browsers and different versions, 
rather than just the latest. 


The test of the finished application, completed internally by members of 
the development team, is called an alpha test. 


Finally, the software is tested externally, either outside the team or
outside the production company. This is called a beta test. If users are
involved it is called user evaluation. Initially, this will be in a controlled
environment to ensure that it meets the user’s requirements and then in a
live environment by some friendly users, to identify any hidden problems.


Testing in a live environment is not always possible, as in some
applications there can be no errors. For example, a nuclear reactor
control system, a flight control support system in an aircraft or a patient
monitoring system in a hospital intensive care unit must all be 100%
error-free before they are tested in a live environment. The only way to
test this software is by simulating the live environment. For example, a
simulation of a nuclear reactor will be developed to test the system which
will control it.


Implementation and Support 


When all the previous stages have been completed to the satisfaction of 
everyone involved, the system is ready for implementation. User 
documentation or operating instructions may be required. After the 
installation of the system, it must be kept operational and maintained 
according to the needs of the users. If those needs change and the 
amendments required are substantial, then the software lifecycle 
described above would be applied to the amendments. 


Definition: lifecycle 


The complete lifetime of a software system from initial conception through 
to final obsolescence. The term is most commonly used in contexts 
where programs are expected to have a fairly long useful life, rather than 
in situations such as experimental programming, where programs tend to 
be run a few times and then discarded. Traditionally, a lifecycle is 
modelled as a number of successive phases, typically: 


• user requirements;
• system requirements;
• software requirements;
• overall design;
• detailed design;
• component production;
• component testing;
• integration and system testing;
• acceptance testing and release;
• operation and maintenance.


Such a breakdown tends to obscure several important aspects of 
software production, notably the inevitable need for iteration around the 
various lifecycle activities in order to correct errors, modify decisions 
which prove to have been misguided, or reflect changes in the overall 
requirements for the system. 


It is also somewhat confusing to treat operation and maintenance as just
another lifecycle phase, as during this period it may be necessary to 
repeat any or all of the activities required for initial development of the 
system. 


There has therefore been a gradual movement towards more 
sophisticated models of software lifecycle. These models provide explicit 
recognition of iteration, and often treat the activities of the operation and 
maintenance period simply as iteration occurring after, rather than before, 
release of the system for operational use. 


See also spiral model, V-model, waterfall model.


Exercise 3.1 [30 minutes]


a) Name four stages of the program development process. 


b) Which stages in the program development process involve users? 
In what way are they involved? 


c) What are the main contents of a program specification? 


4 Structure Diagrams 


4.1 Introduction 


In this section you will be introduced to some of the techniques used by 
programmers in the design phase of the software development process. 
So far, the importance of defining the appropriate data type has been 
emphasised. It is now time to look at the methods that are used to 
organise data for computer processing and the techniques that can be 
applied both for clarifying the structure and manipulating the data. 


4.2 Data Structure Diagrams 


The first technique is the use of data structure diagrams as a tool to 
analyse how the data is organised. This is a technique that can be 
applied to the analysis of both data and problems. It is initially applied to 
data structures, but will also be used later, in the section on Analysing the 
Problem. 


A data structure is a way of describing the relationship of the component 
parts of the structure to the whole. If we were to collect information about 
a group of students studying a module in a college, we would have a 
name for the module and the components would be the students. Each 
student has data recorded such as name, address, assessment results 
and so on. 


Definition: data structure (information structure) 


A data structure is a specialised format for organising and storing data. 
General data structure types include the array, the file, the record, the 
table, the tree, and so on. Any data structure is designed to organise data 
to suit a specific purpose so that it can be accessed and worked with in 
appropriate ways. In computer programming, a data structure may be
selected or designed to store data for the purpose of working on it with
various algorithms.


Definition: structure 


The structure of a program is the structure of the system, which 
comprises software elements, the externally visible properties of those 
elements, and the relationships between them.


The technique is as follows:


• A rectangle containing a * placed in the top right hand corner means
an iteration (or repetition) of all subsequent lower level boxes.


• A box containing a small circle placed in the top right hand corner
represents a choice between alternative sets of data, in other words
a selection.


[Diagram: a box marked * labelled "Iteration (repetition)" beside a box marked with a small circle labelled "Selection"]


Figure 3.1 Representation of iteration and selection in a structure diagram


Example 1


Draw a data structure for the students taking a particular module at a
college. Each student has data recorded such as student ID, name, and
results, for each assessment in the module.


[Diagram: structure diagram; the surviving box labels are Student ID, Second Name, Result Details (*), Assignment and Exam, below a box marked *]


There are many students studying this module.


For each student, the data stored will be student ID and name followed by
results, in that order. This is shown by the sequence of the boxes reading
left to right along the row of boxes.


The name is made up of second name and first name and the students will
have a number of results, one for each assessment in the module.


Figure 3.2 Data structure diagram for Example 1
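

The structure in Example 1 can also be written down as type definitions. The sketch below uses Python dataclasses, where a list stands for an iteration (*) and the order of the attributes stands for the left-to-right sequence of boxes. The class names, attribute names and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    second_name: str
    first_name: str


@dataclass
class Result:
    assessment: str  # e.g. "Assignment" or "Exam"
    mark: int


@dataclass
class Student:
    # Sequence: student ID, then name, then results, in that order.
    student_id: str
    name: Name
    results: List[Result] = field(default_factory=list)  # iteration (*)


@dataclass
class Module:
    title: str
    students: List[Student] = field(default_factory=list)  # iteration (*)


module = Module("Programming Methods")
module.students.append(
    Student("12345", Name("Powell", "Sandra"),
            [Result("Assignment", 34), Result("Exam", 46)])
)
print(len(module.students), len(module.students[0].results))  # 1 2
```

Each level of the diagram becomes one class, so a change to the diagram (for example, adding a field) maps to a change in exactly one place.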


Example 2


Draw a data structure diagram to show a pack of cards organised into
four suits.


[Diagram: structure diagram; the surviving box labels are Diamond Suit (o) and Diamond Card]


The cards are either red or black, so this is shown as a selection.


The red cards can be either diamonds or hearts, again this is shown as a
selection.


The diamond suit consists of 13 cards, so this is shown as an iteration.


Figure 3.3 Data structure diagram for Example 2
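

Selection and iteration also have direct counterparts in code. In the Python sketch below, an enumeration models a selection (exactly one of the alternatives applies) and a list models an iteration (the 13 cards of a suit). The names Colour, Suit, RANKS and pack are assumptions for the example, as is the way the ranks are written.

```python
from enum import Enum


class Colour(Enum):  # selection: a card is either red or black
    RED = "red"
    BLACK = "black"


class Suit(Enum):  # selection: each colour offers a choice of two suits
    DIAMONDS = ("diamonds", Colour.RED)
    HEARTS = ("hearts", Colour.RED)
    CLUBS = ("clubs", Colour.BLACK)
    SPADES = ("spades", Colour.BLACK)

    @property
    def colour(self):
        return self.value[1]


# Iteration: every suit is a repetition of 13 cards.
RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
pack = {suit: [(rank, suit) for rank in RANKS] for suit in Suit}

red_suits = [suit.name for suit in Suit if suit.colour is Colour.RED]
print(red_suits, len(pack[Suit.DIAMONDS]))  # ['DIAMONDS', 'HEARTS'] 13
```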

Example 3


A chess set can be illustrated by a data structure diagram. First, there is
a choice between a white and black set of figures. Each set consists of a
repetition of pawns, castles, knights and bishops and two other single
pieces. However, because pieces can come in any order, they are initially
defined as selections.


Chess Set
    White Set (o)    Black Set (o)
        Pawns (o)    Bishops (o)    Castles (o)    Knights (o)    Queen (o)    King (o)
        Pawn (*)     Bishop (*)     Castle (*)     Knight (*)

[In the diagram, o marks a selection and * marks an iteration.]


Figure 3.4 Data structure diagram for the chess set


Exercise 3.2 [30 minutes]


A sports club maintains records of its members showing name,
category of membership and payment details. Members are either life
members or standard members. Life members do not pay membership
fees, so no payment details are needed.


Draw a data structure diagram to represent this data. 


Exercise 3.3 [30 minutes] 


A train consists of an engine followed by a number of 1st class 
coaches, then a number of 2nd class coaches with a guards van at the 
end. Draw a data structure diagram to represent this structure. 


5 Organising Information


The need to organise information has always existed and the 
maintenance of files of related data preceded the invention of computers. 
Most people will be familiar with the idea of organising data in this way. 


The data structure of a file to be used for computer processing is defined
using the terms file, record and field.


• A file is composed of records.
• Each record contains data organised in a defined structure.
• The components of this structure are called fields.


Definition: file


Information held on backing store (i.e. usually on magnetic disk or
magnetic tape) in order to:

(a) enable it to persist beyond the time required for execution of a single
job

and/or

(b) overcome space limitations in the main memory.

Files may hold data, programs, documents, pictures, or any other
information. They are referred to by file name.


Files with a very brief existence (i.e. in case (b) above), or where they
simply carry information between one job and the next in sequence, are
called work files. See also master file, data file.


Definition: record


Records consist of a series of related fields which contain data items
concerning an entity (e.g. a payroll record would contain all the data items
concerning an employee’s pay details). A file consists of a number of
records.


Definition: field


A field is an item of data within a record. It is made up of a number of
characters, e.g. a name, a date or an amount.


Consider the results file referenced earlier, in which details of the 
student’s ID, name and assessment results for the four core modules and 
two optional modules are stored. 


This results file will consist of many records, one for each student, and 
the fields are the details of each student. The fields such as student_id, 
second_name, first_name, operate like variables, in that they have a 
name and contain data. 


Note that the data stored in the modules part of the record would be the 
module number, 1st assessment mark and 2nd assessment mark for 
each of the four core modules and two optional modules. The description 
of this organisation could be as follows, but note that only two modules 
are illustrated in the table. 


Results File: Record Details

           | Name                     | Modules
Student ID | Second name | First name | 1st core module    | 1st optional module
           |             |            | number and results | number and results
12345      | Powell      | Sandra     | C1  34  46         | O2  45  0
23456      | James       | Jenny      | C1  36  22         | O3  46  50


Figure 3.5 Table of data for the results file


[Diagram: Results file > Student (*) > Student ID, Second Name, Module results (*) > Module Number, Result Details (*) > Assessment mark]


Figure 3.6 Data structure diagram for the results file


As the results file is a collection of records, one for each student, it is 
shown as an iteration. The top level box is drawn to represent the file — 
results. The next level is the box showing an iteration of student records, 
each of which is shown as a sequence of fields on the level below. 


The modules are also shown as an iteration, as there are a total of six 
modules and for each module there are fields associated with the module 
number and the results. The results are shown as an iteration because 
there will be more than one result associated with each module as the 
students will receive a mark for each assignment or exam. Compare the 
data structure diagram in Figure 3.6 with the table of data in Figure 3.5, 
as they both represent the same record structure. 
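

A useful property of such a diagram is that each iteration in the data usually becomes a loop in the program that processes it. The Python sketch below holds the two sample records from Figure 3.5 (values as read from that table) in nested lists and dictionaries. The key names and the total_marks function are assumptions made for illustration.

```python
results_file = [
    {
        "student_id": "12345",
        "second_name": "Powell",
        "first_name": "Sandra",
        "modules": [
            {"module_number": "C1", "marks": [34, 46]},
            {"module_number": "O2", "marks": [45, 0]},
        ],
    },
    {
        "student_id": "23456",
        "second_name": "James",
        "first_name": "Jenny",
        "modules": [
            {"module_number": "C1", "marks": [36, 22]},
            {"module_number": "O3", "marks": [46, 50]},
        ],
    },
]


def total_marks(records):
    """Add up every assessment mark for each student."""
    totals = {}
    for student in records:                # iteration of student records
        total = 0
        for module in student["modules"]:  # iteration of module results
            for mark in module["marks"]:   # iteration of result details
                total += mark
        totals[student["student_id"]] = total
    return totals


print(total_marks(results_file))  # {'12345': 125, '23456': 154}
```

The three nested loops correspond one-to-one with the three starred boxes described above.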


Designing Record Structures 


The fields in a record must be carefully designed to contain all the data 
that may need to be referenced for producing information from the file. 
This task would be performed in conjunction with the client and the result 
would form part of the program specification required by a programmer, 
before he/she can start to build the program.


There is more than one correct answer to the following questions.


Exercise 3.4 [40 minutes]


The treasurer of a local sports and social club wishes to record all
financial transactions on the computer. The club has four different
categories of membership: junior, senior, life membership and monthly
membership. As a consequence, different subscription rates apply.


Juniors and life members do not pay subscriptions. Subscriptions are 
payable each January for senior members, whereas monthly members 
pay monthly, 12 times per year. The treasurer wants a record of when 
members have paid so that, at the end of the financial year (December) 
when he does his accounts, he can identify those members who have 
missed paying a subscription. 


You have been asked for advice concerning the design of the file 
structure. 


a) Complete a table showing field names and sample data. 


b) Draw a structure diagram for the file. 


Exercise 3.5 [20 minutes] 


Design a file structure for patients at a medical centre. Draw a structure 
diagram for this file. 


5.3 File Structures on a Computer 


The programmer does not need to worry about the physical structure of a 
file as the operating system will normally control its creation and 
maintenance. All the programmer has to do, is include an instruction to 
‘open the file’ before any instructions to read the file and to ‘close the file’ 
when the end of the file is reached. Instructions for testing for the 
end_of_file (EOF) condition are provided in most programming 
languages. 
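

The open, read until end of file, close pattern looks like this in Python, where readline() returns an empty string once the end of the file has been reached. The function name count_records and the sample file contents are assumptions for the example; the temporary file exists only so that the sketch can be run as it stands.

```python
import os
import tempfile


def count_records(path):
    """Count the records (lines) in a text file."""
    handle = open(path, "r")  # 'open the file'
    count = 0
    try:
        while True:
            line = handle.readline()
            if line == "":  # end-of-file (EOF) condition
                break
            count += 1
    finally:
        handle.close()  # 'close the file'
    return count


# Build a small sample file so that the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as sample:
    sample.write("record 1\nrecord 2\nrecord 3\n")
    sample_path = sample.name

record_count = count_records(sample_path)
os.remove(sample_path)
print(record_count)  # 3
```

In everyday Python the same loop is normally written as "with open(path) as handle: for line in handle:", which opens, tests for EOF and closes automatically; the longer form is shown here because it makes each of the three instructions visible.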


Data Records (called File body in diagrams)

record 1 | ... | record n


Figure 3.7 Table showing the general structure of a file


Compare the table shown in Figure 3.7 with the diagram shown in Figure
3.8. They both represent the file structure.


[Diagram: structure diagram of a file; the surviving box label is Fields (*)]


As a file is a collection of records, it can be shown as containing an
iteration. The top level box is drawn to represent the file. On the next level
is an iteration of body records, which could themselves be shown on
another level as a number of fields.


Figure 3.8 Diagram showing the general structure of a file


The structure of a record can contain layers of complexity. There may be 
fields, for example, that are subdivided into more fields, giving more 
detailed information or allowing areas for alternative information. 


[Diagram: a Record box marked * whose fields include a ‘block of fields’; the remaining box labels are not recoverable]


Each record is made up of a series of fields. The representation shown
denotes any number of fields. The second field has been replaced with a
‘block of fields’ which shows an iteration of two fields (field a and field b).
An example of this might be field 1 containing a student number and the
‘block of fields’ being exam results. This would be an iteration of the
results for a number of modules, where field a is a module number and
field b is the examination result.


Figure 3.9 Diagram showing further breakdown of the general structure of a file


Exercise 3.6 [30 minutes]


Produce the data structure diagram for the following file: 


Stock File: 


• There are two types of records within this file, the main record and
trailer records.


• Each main record is followed by as many trailer records as there
are outstanding orders for that part.


• Note that these trailer records are part of the data in the record.


Main record:

• Part number
• Description
• Quantity in stores
• Minimum re-order quantity
• Delivery lead times (weeks)
• Current price


Trailer records:

• Order number
• Order quantity
• Order week number


6 Fixed and Variable Length Records


6.1 Fixed Length Records


A fixed length record is one in which the number of characters allowed in 
each record can be predetermined. This occurs when the data can be 
well defined and will remain in the same format for each record. In 
addition, each field can be given a sensible maximum size to cope with 
variations in the size of the content. 


Example 


Part of a staff record may be designed as follows: 


Field Name          Size (in characters)
Employee Number     5
Surname             20
Initials            5
First name          15


Figure 3.10 Part of a staff record showing field lengths


Each of the field sizes can be reasonably determined: a sensible
maximum can be decided for the surname and the initials, and fixed sizes
for the other fields.


In records such as this, it is easy to determine the position of each record 
in a file and consequently the algorithms needed to access the record are 
relatively straightforward. It is also possible to determine the total data 
volume, hence ensuring appropriate backing storage is available. 
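

The reason access is straightforward is that record n always starts at character n × record size. The Python sketch below uses the field sizes from Figure 3.10, giving a record size of 5 + 20 + 5 + 15 = 45 characters. The staff details and the function names are hypothetical, and a string stands in for the file.

```python
FIELD_SIZES = [
    ("employee_number", 5),
    ("surname", 20),
    ("initials", 5),
    ("first_name", 15),
]
RECORD_SIZE = sum(size for _, size in FIELD_SIZES)  # 45 characters


def pack_record(values):
    """Pad (or cut) each field to its fixed size and join the fields."""
    return "".join(str(values[name]).ljust(size)[:size] for name, size in FIELD_SIZES)


def read_record(data, index):
    """Fetch record number `index` (counting from 0) directly by position."""
    start = index * RECORD_SIZE
    raw = data[start:start + RECORD_SIZE]
    record = {}
    offset = 0
    for name, size in FIELD_SIZES:
        record[name] = raw[offset:offset + size].rstrip()
        offset += size
    return record


staff = [
    {"employee_number": "10001", "surname": "Powell", "initials": "S", "first_name": "Sandra"},
    {"employee_number": "10002", "surname": "James", "initials": "J", "first_name": "Jenny"},
]
data = "".join(pack_record(person) for person in staff)

print(len(data))                        # 90
print(read_record(data, 1)["surname"])  # James
```

The total data volume is equally easy to estimate: the number of records multiplied by RECORD_SIZE.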


6.2 Variable Length Records 


There are occasions when the records in a file are required to store 
variable amounts of data. It is not sensible to allow sufficient spare 
capacity in the field lengths to cope with this variation, as this would be a 
highly inefficient use of the backing store. 


In the staff file example, this could occur if we had to store details of 
previous work experience, as considerable variation will occur between 
employees. Using an estimate, based on the maximum storage space 
needed for such a record, could mean considerable wastage. In such a 
case, the more complex processing requirements needed to deal with 
variable length records may be justified. It is the responsibility of the 
design team to ensure that the processing/storage trade-off is sensibly 
chosen. 
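

One common way of handling variable length records, shown here purely as an illustration and not as a method prescribed by this chapter, is to store the length of each record in front of it. The Python sketch below uses a 4-digit length prefix; the prefix width, the function names and the sample work-experience entries are all assumptions.

```python
PREFIX_WIDTH = 4  # assumed: each record is preceded by its length as 4 digits


def write_records(records):
    """Join the records, each preceded by its own length."""
    return "".join(f"{len(record):0{PREFIX_WIDTH}d}{record}" for record in records)


def read_records(data):
    """Walk through the data, using each length prefix to find the next record."""
    records = []
    position = 0
    while position < len(data):
        length = int(data[position:position + PREFIX_WIDTH])
        position += PREFIX_WIDTH
        records.append(data[position:position + length])
        position += length
    return records


experience = ["none", "clerk 1999-2004; analyst 2004-2010"]
stored = write_records(experience)
fixed_size = len(experience) * max(len(entry) for entry in experience)

print(len(stored), fixed_size)  # space used: variable length v. fixed length
print(read_records(stored) == experience)
```

The trade-off described above is visible here: less space is used, but a record can no longer be located by simple multiplication, so the reading routine has to step through the prefixes one by one.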


7 Analysing the Problem 


It is very important to insist on a proper program specification. If you are 
unclear about what needs to be achieved, you can get into a cycle of 
confusion that can be expensive and frustrating for both the client and the 
producer. 


Having obtained a satisfactory program specification, the next step is to
analyse the problem. Complicated problems can be solved more easily if
they are broken down into simpler tasks and sub-problems in a
step-by-step process.


At each step the problem is broken down further, delaying the 
consideration of detail as long as possible. This approach is known as 
top-down development by stepwise refinement. 


Essentially the problem is studied and the major components established 
in outline only. (It is important not to be distracted by detail at this stage.) 


Having established the major components, the technique is repeated for
each of these major components until the stage is reached where the
solution becomes definable in terms of program instructions.


This is easily illustrated by means of a block diagram. Note the approach
consists of a sequence of clearly defined steps, each one of which provides
a more precise analysis of the problem than the previous one.


[Diagram: hierarchical block diagram with the labels Main Problem, Step 1, Step 2, Further, Sub-problems and Refinements]


Figure 3.11 Hierarchical block diagram for top-down stepwise development
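

The same idea can be followed in code. In the Python sketch below, the main problem is first written in outline only, in terms of three components that do not yet exist; each component is then refined in turn until it is expressed in program instructions. The problem chosen (reporting the average of a set of marks), the function names and the valid range of 0 to 100 are assumptions made for illustration.

```python
# Main problem: stated in outline only, in terms of its major components.
def produce_report(marks):
    valid = select_valid_marks(marks)
    average = calculate_average(valid)
    return format_report(len(valid), average)


# Step 1: each major component is refined into program instructions.
def select_valid_marks(marks):
    """Keep only the marks in the assumed valid range 0 to 100."""
    return [mark for mark in marks if 0 <= mark <= 100]


def calculate_average(marks):
    """Average of the marks, or 0.0 when there are none."""
    return sum(marks) / len(marks) if marks else 0.0


def format_report(count, average):
    """Lay the result out as a single report line."""
    return f"{count} marks, average {average:.1f}"


print(produce_report([34, 46, 45, 120]))  # 3 marks, average 41.7
```

Note that produce_report could be written, read and checked before any of the three components had been coded, which is exactly the delaying of detail that the technique relies on.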


Definition: top-down development 


An approach to program development in which progress is made by 
defining required elements in terms of more basic elements, beginning 
with the required program and ending when the implementation language 
is reached. At every stage during top-down development, each of the 
undefined elements from the previous stage is defined. 


In order to do this, an appropriate collection of more basic elements is
introduced, and the undefined elements are defined in terms of these
more basic elements (“more basic” meaning that the element is closer to
the level that can be directly expressed in the implementation language).


These more basic elements will in turn be defined at the next stage in 
terms of still more basic elements, and so on, until at some stage the 
elements can be defined directly in the implementation language. 


In practice, “pure” top-down development is not possible. The choice of 
more basic elements at each stage must always be guided by an 
awareness of the facilities of the implementation language, and even then 
it will often be discovered at a later stage that some earlier choice was 
inappropriate, leading to the need for iteration. 


Compare bottom-up development.


Definition: stepwise refinement


An approach to software development in which an initial, highly abstract 
representation of a required program is gradually refined through a 
sequence of intermediate representations, to yield a final program in the 
chosen programming language. 


The initial representation employs notations and abstractions which are 
appropriate to the problem being addressed. Subsequent development 
then proceeds in a sequence of small steps. Each step refines an aspect 
of the representation produced by the previous step, thus yielding the 
next representation of the sequence. 


Typically, a single step involves simultaneous refinement of both data 
structures and operations, and is small enough to be performed with 
some confidence that the result is correct. Refinement proceeds until the 
final representation in the sequence is expressed entirely in the chosen 
programming language. 


This approach is normally associated with N. Wirth, designer of the 
Pascal and Modula languages. Compare structured programming. 


Definition: program decomposition 


The breaking down of a complete program into a set of component parts,
normally called modules. The decomposition is guided by a set of design
principles or criteria that the identified modules should reflect. As the
decomposition determines the coarse structure of the program, the
activity is also referred to as high level or architectural design. See also
modular programming, program design.


Definition: modular programming 


A style of programming in which the complete program is decomposed
into a set of components, termed modules, each of which is of a
manageable size, has a well-defined purpose, and has a well-defined
interface for use by other modules. Since the only alternative (that of
completely monolithic programs) is untenable, the point is not whether
programs should be modular, but rather what criteria should be employed
for their decomposition into modules.


This was raised by David Parnas, who proposed that one major criterion 
should be that of information hiding. Prior to this, decomposition had 
typically been performed on an ad-hoc basis, or sometimes on the basis 
of ‘stages’ of the overall processing to be carried out by the program, and 
only minor benefits had been gained. 


More recently, there has been great emphasis on decomposition based 
on the use of abstract data types and on the use of objects or object 
orientation; such a decomposition can remain consistent with the 
principles of information hiding. 
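

As an illustration of information hiding, the following is a minimal
sketch in Python (chosen here only for illustration; the workbook does not
prescribe a language at this point). The class name MarkStore and its
methods are hypothetical. Other modules use only the small interface
(add_mark and average) and never depend on how the marks are stored.

```python
class MarkStore:
    """A module with a well-defined interface and a hidden representation."""

    def __init__(self):
        # Hidden detail: marks are kept in a dictionary of lists.
        # Callers should not rely on this; it could change later.
        self._marks = {}

    def add_mark(self, name, mark):
        """Record one examination mark for the named student."""
        self._marks.setdefault(name, []).append(mark)

    def average(self, name):
        """Return the mean of the marks recorded for the named student."""
        marks = self._marks[name]
        return sum(marks) / len(marks)
```

Because the dictionary is reached only through the two methods, the
storage could later be replaced (for example by a file) without changing
any module that uses MarkStore.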
8 Designing a Top-down Modular Program


Structured programming techniques are addressed in this section. There 
is a different approach to the design of object-oriented programs, which 
will be addressed later in the workbook. 


Every structured program has one module which contains the instructions 
that supply the top-down logic for the whole program. In the C language, 
it is the function called ‘main’. This contrasts with Pascal, which has its 
main procedure at the end of the program. In COBOL, it is the first 
paragraph in the procedure division. 


This module normally controls the three basic functions of the program, 
namely: 


• initialising;
• processing;
• closing down the program.


In each of these stages you can expect to find similar sets of tasks, 
whichever program is being written. This, of course, is a great help when 
you become familiar with these routines. 
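

The shape of such a controlling module can be sketched as follows. This
is a hedged Python illustration only: the function names and the sample
data are assumptions made for the example, and a real program would
open files rather than use a fixed list.

```python
def initialise():
    """Initialising: set up the data and variables the program needs."""
    return {"records": [12, 7, 30], "total": 0}


def process(state):
    """Processing: work through every record, accumulating a total."""
    for value in state["records"]:
        state["total"] += value


def close_down(state):
    """Closing down: produce the final message."""
    return f"Total of {len(state['records'])} records: {state['total']}"


def main():
    """The controlling module: it supplies the top-down logic only."""
    state = initialise()
    process(state)
    return close_down(state)
```

Note that main contains no detail of its own; each of the three stages is
delegated to a lower-level module.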


Examples of Initialisation Routines 


Some examples of the tasks which may be done at the beginning of a 
program are: 


• requesting library routines;
• opening files;
• defining headings, creating opening screens;
• initialising variables.


Examples of Processing Tasks 


These could be further sub-divided into processes concerned with input, 
computation and logic or output. 


Input: 


• reading records from a file;
• requesting data from the user.


Computation and logic:


• comparing values;
• computations;
• assignments.


Output:


• updating files;
• writing records to a file;
• printing a report;
• displaying information on screen.


Examples of Closing Tasks 


At the end of a program, before exiting, the following tasks may need to 
be completed: 


• producing printed output;
• writing a file;
• printing final totals in a report;
• displaying final user messages;
• closing all open files.


We have seen that hierarchical diagrams can be used to illustrate a 
method of problem solving. In order for such diagrams to be useful for 
program design, there must be a method of showing sequence, selection 
and iteration in the same way as has been shown for data structures. 
There must be a way of indicating repetitive tasks such as: 


• reading records from a file;
• processing the elements of an array;
• repeating procedures from a lower level in the diagram.


There must also be a way to show the terminating condition for these 
tasks. 


The method described here is used in the technique known as Jackson 
Structured Programming (JSP) Method. 


The convention is to write the terminating condition (or a reference
code for a conditions list) above the box indicating the iteration, and
to insert a * in the top right hand corner of the box.

[Figure: a Process box above an iterated Tasks box, with the terminating
condition written above the Tasks box and a * in its top right hand
corner]


Figure 3.12 Jackson structure diagram showing iteration
Example 1


Draw a section of a structure diagram to show a record being read until
the end of the file is reached.

[Figure: a Section box above an iterated record box, with the condition
"Until EOF" written above the record box]


Figure 3.13 Reading records until end of file
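

In code, the iteration in Example 1 corresponds to a loop whose
terminating condition is end of file. The sketch below is an assumption
about how this might look in Python, using an in-memory text file
(io.StringIO) in place of a real file; the function name is hypothetical.

```python
import io


def read_all_records(file):
    """Read records one at a time until end of file is reached."""
    records = []
    line = file.readline()          # read the first record
    while line != "":               # readline returns "" only at EOF
        records.append(line.rstrip("\n"))
        line = file.readline()      # read the next record
    return records
```

For example, read_all_records(io.StringIO("a\nb\n")) yields the two
records in the order in which they appear in the file.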


Example 2 


Draw a section to show a process being repeated until the user types
stop.

[Figure: an iterated process box (marked *), with the condition "until
user types Stop" written above it]


Figure 3.14 Continue processing until user types stop
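

Example 2 can be sketched in the same way. In this hedged Python
illustration the user's entries are passed in as a list so that the
example is self-contained, and the "process" (converting the entry to
upper case) is a stand-in chosen only for the example.

```python
def process_until_stop(entries):
    """Repeat the process for each entry until the user types stop."""
    processed = []
    for entry in entries:
        if entry.strip().lower() == "stop":   # terminating condition
            break
        processed.append(entry.upper())       # the repeated process
    return processed
```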


You must also be able to make a selection of tasks such as: 


• choosing an alternative procedure at a lower level;
• taking actions according to the results of user entries;
• making comparisons and acting on the result.


In this case, the technique is to write the conditions above or below each 
box to be chosen as the result of the selection and to insert an ‘o’ in the 
top right hand corner of the box. 
Example 3


Input a number; if the number is zero or positive, then calculate 17% of
the number and display the result, otherwise display an error message.

[Figure: an Input number box, then a selection (boxes marked o) between
the condition number >= 0, with boxes for process calculation and
display answer, and the condition number < 0, with a box for display
error message]


Figure 3.15 Jackson structure diagram showing convention for selection

On some occasions no action will be taken as one of the results of a 
logical test. In this case the resultant box will either be left blank or a line 
will be drawn inside it. 


Exercise 3.7 [40 minutes] 


a) Draw a section of a structure diagram for an element called draw, 
which allows the user to draw a circle upon inputting a C, a box upon 
inputting a B or to quit upon inputting a Q. 


b) Draw a section to show a record being read and written to a new file 
if it contains the field NCC. 


Exercise 3.8 [40 minutes] 


a) Draw a section of a structure diagram to read a record, and if sales > 
100 then do process A, otherwise do process B. 


b) Draw a section of a structure diagram to read a record, and if sales 
>= sales target, then do process A, otherwise do nothing. 
Determining the Structure of a Program from a Given Specification


In this section, the steps involved in designing a program from a program 
specification will be outlined. A straightforward example is given to 
illustrate the design process. Some examples, showing the complexities 
that can be introduced with different logical data structures, will be 
introduced later. 


Program Specification 


The program is to read a file of records consisting of a student's name
and a set of six examination marks, each one an integer percentage
score, although some students may have zeros entered. The average
mark for each student is to be calculated and then assessed for a grade.
The grades are:


• Fail, which is an average mark between 0 and 39;
• Pass, which is > 39 and <= 59;
• Merit, which is > 59 and <= 79;
• Distinction, which is > 79.


The results are to be written to a new file and each record should contain
name, average percentage mark and grade. On termination of the
program, the number of students for whom examination marks have been
processed is to be printed.


At the top level, the program can be separated into three components:
initialisation tasks, processing tasks and closing tasks.


Figure 3.16 Top level of top-down development — stepwise refinement 


[Figure: a Program box divided into Initialising, Processing and Closing]


Step 1


At the top level the outline requirements for each main component are:


• Initialising: open files.
• Processing: process each record to find grades.
• Closing: close files.
[Figure: a Program box divided into Initialising (open all files),
Processing (process each record to calculate grades) and Closing (close
all files)]


Figure 3.17 Step 1 of top-down development — stepwise refinement 


Step 2 


Each component in turn can be further divided to give the following
refinements:


• Initialising refinement: open files for reading and writing.
• Processing refinement: set up loop to read in and process each
record.
• Closing refinement: close files, output closing message.


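[Figure: the Program box divided into Initialising, Processing and
Closing. Initialising contains Open all files, refined into Open input
file and Open output file. Processing contains the iterated box Process
a record to calculate grade (marked *, condition until EOF). Closing
contains Close all files and Output closing message.]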


Figure 3.18 Step 2 of top-down development — stepwise refinement 
Step 3


No further refinements are needed for the initialisation or termination
routines. However, further analysis of the processing tasks is needed.


To process the records, processes are needed to:


• calculate the mean;
• establish the grade;
• write the results to a file;
• count the number of records.


[Figure: as Figure 3.18, with the iterated box Process a record to
calculate grade (marked *, condition until EOF) refined into calculate
average, establish grade, write results to file and accumulate number
of results]


Figure 3.19 Step 3 of top-down development — stepwise refinement 


Step 4 


Although a small refinement is needed to calculate the average (mean), 
the problem has now been simplified to such an extent that the actions 
required to complete the diagram are easily determined. 


Calculate average:


• Accumulate marks: add each mark to total variable.
• Divide total by number of marks (6).
• Assign result to average variable.


Assess grade:


• Compare average with grade boundaries.
• Write result to grade variable.


Write to file:


• Write name variable + average mark + grade variable to record fields
in new file.


Accumulate number of records:


• Add 1 to number_of_records variable.
• Make sure this running total is at zero at the beginning of the
program.
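

To show how the refined design maps onto code, the following Python
sketch implements the processing component. It is an illustration under
stated assumptions, not part of the specification: the record layout
(name followed by six marks, separated by commas), the output layout and
all function names are hypothetical, and in-memory files (io.StringIO)
can stand in for real ones.

```python
import io


def calculate_average(marks):
    """Accumulate the marks, then divide by the number of marks."""
    total = 0
    for mark in marks:
        total += mark
    return total / len(marks)


def establish_grade(average):
    """Compare the average with the grade boundaries."""
    if average > 79:
        return "Distinction"
    if average > 59:
        return "Merit"
    if average > 39:
        return "Pass"
    return "Fail"


def process_file(infile, outfile):
    """Process each record until EOF; return the number processed."""
    number_of_records = 0                 # running total starts at zero
    line = infile.readline()
    while line != "":                     # until EOF
        fields = line.strip().split(",")
        name = fields[0]
        marks = [int(m) for m in fields[1:7]]
        average = calculate_average(marks)
        grade = establish_grade(average)
        outfile.write(f"{name},{average:.1f},{grade}\n")
        number_of_records += 1
        line = infile.readline()
    return number_of_records
```

Each function corresponds to one box in the structure diagram, so the
code can be checked against the design one component at a time.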


Exercise 3.9 [60 minutes] 


Continue with the stepwise refinement for calculate average, assess
grade and accumulate running total, by using the information provided
above in step 4. Complete the diagram in Figure 3.19.


10 Structured Programming Diagrams 


10.1 Introduction 


In Chapter 2 of this workbook the three control constructs required for the
building of structured programs (sequence, iteration and selection) were
introduced.


Earlier in this chapter, data structures and top-down design have been 
studied. It is now time to combine and formalise these methods and the 
technique that we have chosen is Jackson Structured Programming 
(JSP). 


This technique: 


• uses top-down stepwise refinement;
• only uses the three control constructs;
• bases the program design on the structure of the data to be
processed.
The steps are:


1. Starting from the program specification, produce data structure 
diagrams of the data to be input for processing and the data to be 
output. 


2. Produce a program structure which reflects the requirements of the 
data structures, ensuring program and data structures correspond. 


3. Analyse this in a top-down manner to produce an increasingly more 
specific program structure, refining each layer of the design until all the 
necessary actions and conditions have been identified at their 
appropriate levels. 


The earlier work on data structures produced physical data structure
diagrams. A programmer is interested in matching the data structures
with the program specification. This may mean that some amendment to
the physical structure is required, because not all components of the
physical data structure diagram may require processing, and greater
emphasis may be placed on some components and less on others. The data
structure diagram that matches the program design is called a logical
data structure diagram: it reflects how the data is logically structured
for the purpose of the program.


Example 1 


A stock record file is organised into sections by content; in this case, 
children’s clothes, men’s clothes and women’s clothes. The data on 
children’s clothes is to be extracted and updated. 


The physical Data Structure Diagram (DSD) is as shown in Figure 3.20
and the logical DSD in Figure 3.21.

[Figure: a Stock File box divided into Men’s clothes, Children’s clothes
and Women’s clothes, each containing an iterated Stock Records box
(marked *)]



Figure 3.20 Physical Data Structure Diagram 


The logical data structure reflects the fact that no processing is required 
of the records in the sections men’s clothing and women’s clothing. 
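[Figure: a Stock File box with a selection (boxes marked o) between
Other sections, chosen under the condition "not children’s", and
Children’s clothes, chosen under the condition "children’s"; Children’s
clothes contains an iterated Stock Records box (marked *)]


Figure 3.21 Logical Data Structure Diagram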
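

A program built on this logical structure has the same shape: a
selection on the section, and an iteration over the stock records only
within the children’s section. The Python sketch below is a hedged
illustration; the representation of the file as (section, records)
pairs, the prices held in pence, and the nature of the update (a price
increase) are all assumptions made for the example.

```python
def update_childrens_stock(stock_file, price_increase):
    """Extract and update only the children's clothes records."""
    updated = []
    for section, records in stock_file:
        if section == "children's clothes":       # selection: children's
            for item, price in records:           # iteration: stock records
                updated.append((item, price + price_increase))
        # Other sections: no processing is required.
    return updated
```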


Example 2 


A transaction file has been created by inputting the information on sales 
and purchases in batches. Each batch has a header record followed by 
records with the details of the sales or purchases. The batches of records 
are input as they are received and therefore can occur in any order and 
any number. 


The physical data structure diagram, i.e. a diagram reflecting the actual 
structure of the data, is shown in Figure 3.22. 


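[Figure: a Transaction File box containing a File body of batches; each
batch is either Sales or Purchases and consists of a header (Sales
Header or Purchases Header) followed by a body of iterated detail
records (Sale Records or Purchases Records, marked *)]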


Figure 3.22 Physical data structure diagram of transaction file 
If a program is required to process every record and produce a report
which provides totals for sales and purchases, then the logical data 
structure diagram would be very similar. 


However, if a sales report is required (i.e. no reference to purchases) 
which shows totals for both cash and credit sales, then the logical data 
structure diagram would need to be changed. Every record will no longer 
be processed and the purchase records will be ignored. 


The processing for sales records will be different depending on whether 
they are cash sales or credit sales, as two different totals must be 
accumulated. 


The logical data structure diagram is now as shown in Figure 3.23. 


[Figure: a Transaction File box containing a File body, with iterated
boxes (marked *) for Sales Records and Credit Records]


Figure 3.23 Logical data structure diagram of transaction file 
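

The effect of this logical structure on the program can be sketched as
follows. This is an illustrative Python example only: the representation
of each record as a (type, payment, amount) tuple and the type names are
assumptions, since the workbook does not define a record layout.

```python
def sales_totals(records):
    """Accumulate separate totals for cash sales and credit sales."""
    cash_total = 0
    credit_total = 0
    for record_type, payment, amount in records:
        if record_type != "sale":
            continue                  # headers and purchases are ignored
        if payment == "cash":
            cash_total += amount
        else:
            credit_total += amount
    return cash_total, credit_total
```

Purchase records are still read, because the physical file contains
them, but they take no part in the processing.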


3-32 © NCC Education. All rights reserved. Unauthorised duplication is prohibited. 


Programming Methods Chapter 3 — Data Analysis and Problems 


Exercise 3.10 [40 minutes] 


A file contains records of students sorted into undergraduates followed 
by postgraduates. Undergraduate students pay fees at rate A and 
postgraduates at rate B. Students from other countries pay fees at rate 
E for undergraduates and rate G for postgraduates. 


a) Draw a physical data structure diagram for the file. 


b) Draw a logical DSD when it is required to produce a list of all the 
students who have paid their fees. 


c) Draw a logical DSD for a program listing foreign postgraduate 
students. 


d) Draw a DSD for a program listing undergraduates who have not paid
their fees.


10.2 Establishing the Program Structure 


The program structure is established by determining the input and output 
data structures and combining them to produce the program structure. 


As you have seen in the last lesson, the output requirements are a great 
help in determining the logical data structure of the input file. 


The input and output data structures are compared and, starting from the 
top down, components which correspond to each other are identified. To 
correspond, the two components must be in the same relative place as 
each other, and the input component processed to produce the output 
component. 


It is then possible to combine the data structures to produce a program 
structure. 


This process is illustrated in the following two examples, which have been 
chosen as they cover two common types of file processing requirements. 


Example 1: Using One Input File to Produce One Output 
File or Report 


In this example, the input file is organised sequentially. It contains details 
and prices of spare parts which have been ordered by garages. The 
garages have been grouped together by area and the parts orders 
grouped together for each garage. 
[Figure: a Garage Spares Order File box above a File body box, with
iterated components (marked *) beneath it]


Figure 3.24 Physical data structure diagram of the parts orders file 


The requirement is to access this file and produce a total of the value of 
orders by garage. 


[Figure: logical structure of the Garage Spares Order File]


Figure 3.25 Logical data structure diagram of parts orders file 


The report structure is to have a heading, a one-line summary of the total 
order value for each garage, and a footer containing the total of all orders. 
[Figure: a Garage Spares Order Report box containing a report body of
iterated lines of report (marked *); each line holds garage details and
a total ordered, the total being made up of iterated individual order
amounts (marked *)]


Figure 3.26 Physical data structure diagram of the report 


[Figure: the Garage Spares Order File structure shown beside the Garage
Spares Order Report structure (report body, iterated lines of report,
garage details, total ordered, iterated individual order amounts)]


Figure 3.27 Comparison of input and output physical data structure diagrams 
As the output requirement does not need to include the area of each
garage, the logical DSD of the input file does not need to include this
component. The amended logical DSDs of the input and output files
are now as shown in Figure 3.28.


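[Figure: the logical structure of the Garage Spares Order File shown
beside that of the Garage Spares Order Report, which contains iterated
lines of report (marked *), each built from iterated individual order
amounts (marked *)]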


Figure 3.28 Comparison of logical DSD of input and output files 


Note the correspondence between the garage and line of report. 
Correspondence can only occur if the input component is to be processed 
to become the output component. One line of the report is required for 
each garage. There are a number of parts orders for each garage on the 
input file, and these connect to the individual amounts which must be 
accumulated for the total ordered on each line of the report. 


The process required in the program can now be seen to be connected to
the input and output structures, but there are a few changes to be made
so that the three structure diagrams have the same structure.
Remember that the physical structures of the input and output files
cannot be changed, but the logical structure diagrams can be changed.
Dummy components can be added so that the logical data structures for
the input and output are identical.


Note that header and footer boxes have been added to the logical DSD of
the input file. These relate to the start and end of the file, outside
the data part of the file. Remember this is a logical representation,
not a physical one.
The logical data structure diagram for the program is thus:

[Figure: a Garage Spares Order Report Program box with components for
the files at the start and end, print headings, processing and print
footer. Processing contains the iterated box process records for a
garage (marked *), which covers the garage details and total ordered
and contains add individual order amounts to total.]



Figure 3.29 The program structure 


It can be seen that the program structure has the following properties: 


• each data component is related to only one program component;
• each program component is related to only one input and/or one
output component.


The preliminary program structure is now ready for more detailed analysis 
which will then form the basis of the program construction. Further details 
concerning this are available in the self study section. 
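

As a hedged illustration of where this program structure leads, the
Python sketch below follows the same components: print headings, process
the records for each garage, and print the footer. The order record
layout (area, garage, part, amount), the wording of the report lines and
the function name are assumptions made for the example. It relies on the
stated fact that the orders for each garage are grouped together in the
input file.

```python
from itertools import groupby


def garage_report(orders):
    """Build the report lines from orders grouped by area and garage."""
    lines = ["Garage Spares Order Report"]           # print headings
    grand_total = 0
    # One pass of the outer loop for each garage (one line of report).
    for key, garage_orders in groupby(orders, key=lambda o: (o[0], o[1])):
        garage = key[1]
        garage_total = 0
        for _area, _garage, _part, amount in garage_orders:
            garage_total += amount                   # add order amount to total
        lines.append(f"{garage}: {garage_total}")
        grand_total += garage_total
    lines.append(f"Total of all orders: {grand_total}")   # print footer
    return lines
```

The nesting of the two loops mirrors the nesting of the iterated
components in the data structures: garages within the file, and order
amounts within a garage.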


11 CRC Cards 


11.1 Introduction


CRC stands for:


• Class.
• Responsibility.
• Collaboration.
CRC cards were created in the late 1980s as a method to teach the
object-oriented paradigm. Programmers around this time were still used 
to structured methods, and many found it difficult to move to this new way 
of programming. CRC cards aimed to find a way to express a system in 
an abstract way that was analogous to structured programming. 


CRC cards are a brainstorming tool used when first determining which 
classes are needed, and how they will interact. This technique can be 
used for the different phases of the development process from analysis to 
design. A core group of people attend each session. 


The technique was originally employed using specially designed
software, though this changed to using 6” x 4” cards. The cards contain:


1. The class name.
2. Its super and subclasses (if applicable).
3. The responsibilities of the class.
4. The names of other classes that the class will collaborate with to
fulfil its responsibilities.
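

Although CRC cards are a paper technique, the contents of a card can be
pictured as a simple record. The Python sketch below is only an
illustration; the type name CRCCard and its field names are hypothetical.
The sample card uses the librarian responsibilities and collaborators
mentioned later in this section.

```python
from dataclasses import dataclass, field


@dataclass
class CRCCard:
    """The four kinds of information written on a CRC card."""
    class_name: str
    superclasses: list = field(default_factory=list)
    subclasses: list = field(default_factory=list)
    responsibilities: list = field(default_factory=list)
    collaborators: list = field(default_factory=list)


librarian = CRCCard(
    class_name="Librarian",
    responsibilities=[
        "check out books",
        "check in books",
        "add new books to the database",
    ],
    collaborators=["Catalogue", "Borrower"],
)
```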


The cards proved to be advantageous, as they are portable, cheap, 
create better group dynamics and give participants a greater feel for the 
system. 


Objects, as explained in the introduction to object-oriented programming, 
are instances of classes. Objects can be real, such as computer 
hardware, people, and time or they can be virtual objects created for the 
system. Classes have well-defined roles in the system. 


Responsibilities replace the explicit attributes and operations of a class, 
and concentrate on the behaviour of the class. Responsibilities 
concentrate on the interactions made by the class. 


Classes sometimes need to interact with other classes to complete their 
responsibilities. These are known as collaborations. A class collaborates, 
or works with another class, to fulfil its purpose. 


There are many advantages in using Class, Responsibility and 
Collaboration-based modelling: 


• It creates a good understanding of the domain to be modelled.
• It identifies high-level responsibilities.
• It ensures better system design.
• It ensures that knowledge of the domain is shared by everyone.
• A good working model can be developed.
• It is a good way to identify areas that may be missed otherwise.
• It is fun!
The main disadvantage is that the lack of structure can be disconcerting
to structured programmers not used to the object-oriented paradigm.


CRC modelling uses no specific symbols or syntax. The only items
needed are 6” x 4” cards. The cards are simply marked up as in Figure
3.30. Collaborators can be recorded more than once if they carry out
more than one responsibility for a class.

[Figure: a card with the class name and Superclass at the top, and two
columns headed Responsibilities and Collaborators]


Figure 3.30 A Sample CRC card


11.2 Finding Classes and Considering Scenarios


Before a CRC card session begins, the participants need to know what
the requirements of the system will be. CRC card sessions begin with
brainstorming, where the major classes for the system are identified.
These sessions aim to produce as many ideas as possible.


The aim is not to analyse the suggested classes, collaborators and 
responsibilities, but to generate ideas. Names should be short and aim to 
encompass the role that the object plays in the system. Generally, the 
better the naming, the better the understanding people have of the 
requirements of the system. 


Once a list of classes has been developed, the individual classes can be 
discussed and refined until a set is agreed upon. 


After this, the group considers scenarios that will occur in order to test the
classes so far, and to see if there are any that are missing. Some names
may be discarded as they overlap others, and some classes may be
subclasses of others. A subclass is a specialised kind of another class
(its parent, or superclass): it inherits attributes and behaviours from
its parent, but also has attributes and behaviours of its own.
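

The relationship between a class and a subclass can be sketched in
Python as follows. The class names and attributes are hypothetical and
are chosen only to illustrate inheritance in a library setting.

```python
class Item:
    """Parent class: behaviour shared by every kind of library item."""
    loan_weeks = 3

    def __init__(self, title):
        self.title = title

    def describe(self):
        return f"{self.title} (loan period {self.loan_weeks} weeks)"


class Book(Item):
    """Subclass: inherits from Item and adds an attribute of its own."""

    def __init__(self, title, author):
        super().__init__(title)     # reuse the parent's initialisation
        self.author = author
```

A Book can be used wherever an Item is expected, because it inherits
describe from its parent.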


Considering the scenarios helps to achieve an understanding of how
each class interacts with other areas of the system. The scenarios tested
should help to define the responsibilities each class has, that is, what
a class does rather than how it does it. When all the classes have been
found, the group considers scenarios that are likely to occur. A scenario
takes the form ‘what happens when this occurs?’. Scenarios are best kept
as simple as possible.
For example, for a library system, a scenario might be:


‘What happens when Student A wants to take out a book, but has items 
that are overdue?’ 


When ‘finding’ classes, the following tips should be considered: 


• Consider physical objects such as printers, filing cabinets etc.
• Include conceptual objects such as accounts, files and windows.
• Think about the interface of the system.
• Names should be singular, not plural.
• Use nouns and noun phrases.
• Ignore terms that are not meaningful.
• Use active phrases e.g. ‘get new mail’.
• Convert pronouns to the names that they stand for e.g. ‘librarian’
rather than ‘I, we or you’.
• Think about adjectives carefully, as they may be irrelevant or they
could indicate subclasses or other behaviour.


Method 


A CRC card session should involve about five or six people. It is useful to 
include domain experts (people who have a detailed understanding of the 
system to be developed). These may be the people who wrote the 
system requirements, or someone who has an in-depth understanding of 
the domain the system is being developed for. 


Begin with a brainstorming session. Usually someone oversees the 
session, and records suggestions on a flipchart or board at the front. 
Write down each idea as it is suggested. It is important to bear in mind 
the system requirements for this session, so that the suggestions are 
relevant to the desired product. During brainstorming sessions anything
goes: suggestions are not analysed or discussed, and they should be
used to generate a deeper understanding of what is needed. Each
suggestion should help develop more specific ideas. These suggestions
form the basis of the classes for the system.


Next, the classes suggested are discussed in detail. There may be some 
that are repeats or seem less well defined than the others. There will 
probably be a discussion on the semantics of the class names, that is, the 
deeper meaning of the classes. This is a valuable part of the session, as 
it helps to define the system much more thoroughly. At this stage, some 
of the classes will be filtered out. 


Each class is written on a card, and assigned to someone. A short 
description can be added on the back of the card, which can be read out 
to the group for approval. The responsibilities of that class are then added 
to the card in the section provided. For instance, if the class is for a
librarian, the responsibilities for the librarian may be: check out books,
check in books or add new books to the database. Some of the
responsibilities added may not be needed in the scenarios that follow, but
many of those that are needed can be identified at this stage, before
the scenarios are considered.


Next, the group considers simple scenarios which are likely to occur, 
such as ‘what happens if student B wants to take out a book that is 
already on loan to someone else?’. This should help the group to 
discover any missing classes, and the relationships that objects have to 
each other. Scenarios involve a discussion of how the classes behave 
and how they interact with each other. For instance, a librarian lending 
out a book would need to interact with both the catalogue and the person 
borrowing the book. 
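

The collaboration in this scenario can be sketched in Python as follows.
This is a hypothetical illustration rather than a design for the library
system: the class and method names are assumptions, and only the single
responsibility "check out books" is shown.

```python
class Catalogue:
    """Knows which titles exist and whether each is on loan."""

    def __init__(self, titles):
        self._on_loan = {title: False for title in titles}

    def is_available(self, title):
        return title in self._on_loan and not self._on_loan[title]

    def mark_on_loan(self, title):
        self._on_loan[title] = True


class Borrower:
    """Knows which items the borrower currently holds."""

    def __init__(self, name):
        self.name = name
        self.items = []

    def take(self, title):
        self.items.append(title)


class Librarian:
    """Checks out books by collaborating with Catalogue and Borrower."""

    def __init__(self, catalogue):
        self.catalogue = catalogue

    def check_out(self, borrower, title):
        if not self.catalogue.is_available(title):
            return False                  # already on loan, or unknown
        self.catalogue.mark_on_loan(title)
        borrower.take(title)
        return True
```

Walking through the scenario, the first request for a title succeeds and
a second request for the same title is refused, which is the behaviour
the group would expect to discover on the cards.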


Exercise 3.11 
Brainstorming. [60 minutes] 


Refining class and allocating cards. [30 minutes] 
Defining responsibilities and collaborations. [30 minutes] 
Discussing scenarios. [120 minutes] 


This exercise is designed to be a classroom activity. This exercise can 
be run as separate sessions if necessary. 


Requirements: You are going to design a library system for your
college. Each borrower is allowed to borrow up to five items, as long as
they have no overdue items and no fines outstanding. An item can be a
book, journal, or video. Each item can be borrowed for three weeks.


Run a CRC card session to identify the main classes that the system 
will need, and the responsibilities and collaborations each class has. 


When the classes have been finalised and the cards allocated, 
consider the following scenarios: 


What happens if Student A wants to borrow a book which is already on 
loan? 


What happens if Student B wants to borrow a book, has no fines to pay 
and has borrowed fewer than five items? 
12 Summary


In this chapter we have covered:


• Program development process.
• Structure diagrams.
• Organising information.
• Fixed and variable length records.
• Analysing the problem.
• Structured programming diagrams.
• CRC cards introduction including scenarios, role play and method.


The basic principles of structured program design are:


• The problem is analysed using the technique of stepwise refinement.
• The design includes only three constructs: sequence, selection and
iteration.


As you have seen from the diagrams, the program components are
self-contained in that there is, in effect, one way in and one way out.
This is an important characteristic of stepwise refinement and functional
decomposition, namely that the functions:


• perform well-defined operations on well-defined data;
• have internal structures which are independent of the program or
function that contains them.


Programs constructed in this way are easier to maintain and document.


CRC cards are a powerful way to learn about object-oriented thinking and
help in the definition of system objects and their roles. They are a useful
problem solving and analytical technique.
13 Self Study


13.1 Program Development Process 


Note the definition of program specification (page 3-5) and, in 
particular, that the characteristics of a good specification are that it 
should be unambiguous, complete, verifiable, consistent, modifiable, 
traceable, and usable after development. 


Read further on this subject. Make notes concerning the meaning of 
each of the terms identified. 


Look at the definitions of different models of the software lifecycle, 
starting with the waterfall model, then the V-model and the spiral 
model. Compare and contrast after reading the definition of software 
lifecycle (page 3-9). Note particularly the comments concerning 
maintenance. 


13.2 Organising Information 


Self Study 1 


Look at the contents of the fields in the table shown in Figure 3.5. The 
results for only two modules are shown for each student, one core 
assessment and one optional assessment. 


a) What were Sandra Powell's examination results for the optional 
module C1? 


b) What is the student ID of the person who scored 0% in an 
examination? 


c) Who is studying the optional module O3?


d) Assume you are studying the core modules C1, C2, C4 and C5 and 
optional modules O2 and O3. Your student ID number is 98765. 
Add your name. Extend the table to include fields for the examination 
results for all the modules. Add the marks for the examinations for 
core modules C1, C2 and C5 and optional module O3. Assume you 
have passed all examinations except the second examination in C2 
and the first examination in O3. You may choose your own mark for
those examinations passed. Use ‘nil’ as the data entry for those 
examination marks not yet entered. 
This example may seem rather old fashioned, in that it is concerned with
batch processing and the programming involved with processing a master 
file and a transaction file. It has been included as a good example of 
problem solving. The case study is intended to provide further 
explanation and practice for you in designing algorithms to solve 
problems. 


Two Input Files and One Output File 
This is a more complex example which can be used to further demonstrate the complete process of JSP. For simplicity, errors and deletions are not included in the example.


Program Specification 


Definition: Program Specification

A program specification is a document which states what a computer program is expected to do. From a developer's point of view, it can take the form of either a blueprint or a user manual.


A transaction file is used to update a master file and produce a new 
master file. The transaction file is sorted in ascending order on the same 
key as the master file. The update process is to be performed 
sequentially. For simplicity, there will be no more than one transaction 
record for each master record, e.g. as in a weekly payroll update where 
there would be one transaction record per employee containing details of 
the hours worked in that week. The records are matched by comparing 
the keys of the two files. This will result in three types of processing: 


• If the record keys on the master and the transaction file match, then the transaction record is used to update the master record. As there will only be one transaction record for this master record, the updated master record is written to the output file and a new master record and a new transaction record are read.


• If the transaction key is greater than the master key, then there is no matching transaction record for the master record. Thus the master record is written to the new file unchanged, and the next master record is read.


• If, however, the master key is greater than the transaction key, then there is no existing master record for this transaction record. In some situations, e.g. updating the salaries master file with a week's hours worked, this would be an error condition. However, for simplicity, in this program, the transaction record is used to create a new master record and the next transaction record is read in.
This process continues until both files have been processed.


If the master file ends before the transaction file, new records are created 
until the transaction file ends. If the transaction file ends before the 
master file, the master file records are copied until the master file ends. 
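
The update process described above can be sketched in Python. This is only an illustration under stated assumptions, not the design developed in this chapter: records are assumed to be (key, data) pairs held in key-sorted lists rather than files, the sample keys are invented, and the names update_master and apply_transaction are hypothetical.

```python
def apply_transaction(master_data, transaction_data):
    """Hypothetical update rule: combine the old data with the transaction."""
    return master_data + "+" + transaction_data


def update_master(master, transactions):
    """Merge two key-sorted lists of (key, data) records into a new master list."""
    new_master = []
    m_iter, t_iter = iter(master), iter(transactions)
    # Read ahead: fetch the first record of each file (None marks end of file).
    m_rec, t_rec = next(m_iter, None), next(t_iter, None)

    while m_rec is not None or t_rec is not None:
        if m_rec is not None and t_rec is not None and m_rec[0] == t_rec[0]:
            # Keys match: update the master record, then read from both files.
            new_master.append((m_rec[0], apply_transaction(m_rec[1], t_rec[1])))
            m_rec, t_rec = next(m_iter, None), next(t_iter, None)
        elif t_rec is None or (m_rec is not None and t_rec[0] > m_rec[0]):
            # No transaction for this master record: copy it unchanged.
            new_master.append(m_rec)
            m_rec = next(m_iter, None)
        else:
            # No master record for this transaction: create a new master record.
            new_master.append(t_rec)
            t_rec = next(t_iter, None)
    return new_master


# Invented sample data: keys 30 match, 40 and 60 exist only as transactions.
master = [(10, "a"), (20, "b"), (30, "c"), (50, "e")]
transactions = [(30, "x"), (40, "y"), (60, "z")]
result = update_master(master, transactions)
print([key for key, _ in result])
```

Using None as the end-of-file marker keeps the three-way comparison in one loop, so the cases where either file ends first need no separate code.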


To illustrate the problems, consider the following two files and their data. 


Master File Keys    Transaction File Keys
001                 003
002                 004
003                 005
005
007


Figure 3.31 Master and transaction file


The processing requirements are determined by the presence or absence of a key.


[Structure diagram: for both the master file and the transaction file, each possible matching key from 001 to 007 is a selection between "record present" and "record absent".]


Figure 3.32 Logical data structure of the input files
All possible combinations of keys, absent or present, are accounted for. This can be seen in the diagram in Figure 3.32.


• Records present on the master are 001, 002, 003, 005, 007.

• Records present on the transaction are 003, 004, 007.

• Records absent on the master are 004, 006.

• Records absent on the transaction are 001, 002, 005, 006.


The aim of the computer program is to merge these structures; so a more 
representative DSD is first produced by combining them to produce a 
new logical DSD. The diagram in Figure 3.33 shows the data
combinations and is much more useful. 


[Structure diagrams. New logical input structure: the merged input data structure is an iteration of possible matching keys. Output structure: the updated master file is an iteration of records, each of which is a selection between a matched record, a new master record and a master record.]


Figure 3.33 Matching input and output logical structures

The program specification does not include the situation where there are no records to be processed on either the transaction or the master files, so the final matching diagrams for input and output data structures are as shown in Figure 3.34.
[Structure diagram: the merged structures form a selection between the cases M and T present, M only present, T only present, and M and T absent.]


Figure 3.34 Combined logical data structures


A preliminary structured program can now be clearly seen. This can now be used to determine the processing details that will be required at each level.


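[Structure diagram: the update program contains "Process input files", an iteration over possible matching keys; each key is handled by one of the processes for matched records (update), T only (insert) or M only.]


Figure 3.35 Program structure diagram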


Chapter 3 — Data Analysis and Problems Programming Methods 


Study Note 


This stage can also produce further refinements of the final program structure. This is not uncommon in programming techniques where the final details remind the programmer of earlier considerations and may draw attention to processing difficulties not obvious at an earlier stage. A formal design technique must be considered merely as a tool and excessive rigidity in application can obstruct the intended usefulness of the design process.


Having produced a preliminary structure, the next step is to produce a 
detailed design. This is a systematic procedure for identifying elementary 
program operations, i.e. conditions or actions that can be converted to 
one (or only a few) lines of code. 


The first consideration is the conditions to be applied to the iterations and 
selections, (i.e. what determines the conclusion of an iteration and which 
conditional tests apply to the selections). These conditions are determined 
in a top-down manner, progressing downward through the levels of the 
program structure. The process will be illustrated by applying it to the 
update program whose structure you have already determined. 


On examining the preliminary structure, the first conditional situation 
encountered is the need to control the input of the records from the input 
files. When is this process terminated? As all the old master records are 
to be amended (or copied) and any new transactions are to be made into 
new master records, the condition occurs when an end of file (EOF) record 
is reached for both files. 


To be systematic, the possible conditions are numbered as they arise. In 
this case they are: 


C1: EOF record for the master file 
C2: EOF record for the transaction file 


The next conditions to be determined are the selections which will 
activate the amending, inserting or copying of records. The records are 
processed by reading in a transaction record and then a master record, 
and comparing the keys. 

The possibilities are: 


• the transaction and old master record keys match;
• the keys do not match.


In the latter case, either:


• the transaction key is greater than the master key;
• or, the transaction key is less than the master key.


C3: transaction key = master key
C4: transaction key < master key
C5: transaction key > master key


These simple conditions can now be combined to provide the iteration control constructs for the program.


[Structure diagram: the update program is a sequence of Initialising and Process input files; the iteration over possible matching keys and its three selection branches are annotated with combinations of the conditions C1 to C5.]


Figure 3.36 Program structure


Iteration Control List


Condition                        Meaning

WHILE NOT C1 OR NOT C2           While at least one of the files has not reached EOF

UNTIL C1 AND C2                  Until both files reach the EOF record

IF NOT C1 AND NOT C2 AND C3      If neither file has ended and a match occurs

IF C1 OR (NOT C2 AND C4)         If master ended or (transaction not ended and transaction key < master key)

IF C2 OR (NOT C1 AND C5)         If transaction ended or (master not ended and transaction key > master key)

UNTIL C1 AND C2                  Until both files reach the EOF record


KEY


C1: EOF record for the master file
C2: EOF record for the transaction file
C3: transaction key = master key
C4: transaction key < master key
C5: transaction key > master key
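
As a check on how the conditions C1 to C5 combine, the following Python sketch selects one process for each state of the two files. It is one possible reading of the control list rather than a definitive one: the function name select_process and the returned strings are hypothetical, and the key comparisons are only evaluated while both files still have a current record.

```python
def select_process(c1, c2, t_key=None, m_key=None):
    """Choose a process from the EOF flags C1 (master) and C2 (transaction)."""
    if c1 and c2:
        return "finish"                    # UNTIL C1 AND C2: both files ended
    both_open = not c1 and not c2
    c3 = both_open and t_key == m_key      # transaction key = master key
    c4 = both_open and t_key < m_key       # transaction key < master key
    c5 = both_open and t_key > m_key       # transaction key > master key
    if c3:
        return "update"                    # both records present, keys match
    if c1 or c4:
        return "insert"                    # transaction has no master record
    if c2 or c5:
        return "copy"                      # master has no transaction record


print(select_process(False, False, t_key=4, m_key=5))
```

Whenever at least one file is still open, exactly one of the three branches applies, which is the property the selection in the structure diagram relies on.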
The Action List


The next step is to consider the actions needed to complete the 
algorithm. In addition, routines are needed for the initialisation and 
termination of the program. The complete list is as follows: 


Initialisation:

1   The transaction file and the old master file must be opened for reading.
2   The new master file must be opened for writing.

Termination:

3   Close transaction file.
4   Close master file.
5   Close new master file.
6   Stop program.

Input:

7   Read master file record.
8   Read transaction file record.

Output:

9   Write new master record.

Processing:

10  Update existing master record.
11  Create new master record from transaction record.
12  Copy old master record.


Further consideration of the actions required to complete the program 
shows that the program will not be able to proceed unless there are 
records available for key comparison at the start of the program. 


It is necessary therefore to read the first two records ahead of the iteration stage, hence an initial read of both transaction and master is required. Actions 7 and 8 are therefore inserted into the diagram at the initialisation stage. It is also necessary to read in another record whenever one is processed, so that comparisons are still valid. Thus, actions must be taken to read the next records in the respective files in the processes amend, insert and copy. The final structure diagram is shown in Figure 3.37.


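[Structure diagram: the update program is a sequence of Initialising, Process input files and Closing, annotated with the conditions and the numbered actions; actions 7 and 8 appear at the initialisation stage.]


Figure 3.37 Final program structure diagram with conditions and operations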
1 Learning Outcomes


At the end of this chapter you will be able to: 

• Describe how procedures, functions and subroutines form the building blocks.

• Differentiate between local and global variables.

• Use procedure parameters.


2 Introduction 


This chapter introduces further programming techniques related to: 


• Program structures;

• Data structures.


2.1 Program Structures 


These are the forms in which program components are constructed, 
organised and interrelated. The concepts associated with structured 
programming techniques are introduced, in particular the use of 
procedures and functions and the use of variables throughout the 
program. 


2.2 Data Structures 
This chapter explains what data structures are. It introduces the more important and common data structures and explains their use in programming. Abstract data types are explained, including:


• queue;
• stack;
• graph;
• tree.


The most common programming methods related to these structures are 
described, and examples are provided. 


Arrays, lists and linked lists are implementation mechanisms which allow 
the abstract data types to be stored in the computer. These are also 
explained and the programming methods described, with examples 
provided. The problems associated with sorting and searching data are 
also examined. 
3 Procedures and Functions


3.1 Introduction 


The top-down design process essentially means splitting up a problem 
into its component parts over and over again, until a stage is reached 
where building a solution becomes much easier. 


Definition: top-down development 


An approach to program development in which progress is made by defining required elements in terms of more basic elements, beginning with the required program and ending when the implementation language has been reached.


At every stage during top-down development, each of the undefined 
elements from the previous stage is defined. In order to do this, an 
appropriate collection of more basic elements is introduced, and the 
undefined elements are defined in terms of these more basic elements, 
(‘more basic’ meaning that the element is closer to the level that can be 
directly expressed in the implementation language). These more basic 
elements will in turn be defined at the next stage in terms of still more 
basic elements, and so on, until at some stage the elements can be 
defined directly in the implementation language. 


In practice, ‘pure’ top-down development is not possible. The choice of 
more basic elements at each stage must always be guided by an 
awareness of the facilities of the implementation language, and even then 
it will often be discovered at a later stage that some earlier choice was 
inappropriate, leading to a need for iteration. Compare bottom-up 
development. 


Procedures and functions are the building blocks which are needed to 
complete this process. Procedures and functions are sections of a 
program, or subprograms, which are written to perform a specific task and 
this will involve operations on data. They are written once, but can be 
used many times during the execution of the program, and each time the 
procedure or function is used, the data can be different. 
Definition: procedure


A section of a program which carries out a well-defined operation on data specified by parameters. It can be called from anywhere in a program, and different parameters can be provided for each call.


The term procedure is generally used in the context of high-level 
languages; in assembly language the word subroutine is more 
commonly employed. 


Definition: function 


A program unit which, given values for input parameters, computes a 
value. Examples include the standard functions such as sin(x), cos(x), 
exp(x); in addition, most languages permit user-defined functions. A 
function is a ‘black box’ that can be used without any knowledge or 
understanding of the detail of its internal working. In some languages, 
a function may have side effects. 


The normal control sequence of the program, where instructions are 
obeyed in the order in which they were written, is interrupted by a call to 
the procedure or function. When the instructions within the procedure or 
function have been completed, control is returned to the instruction which 
called it and thus the next instruction will be the one after the original 
instruction. 


Definition: call 


To transfer control to a subroutine or procedure, with provision for 
return to the instruction following the call at the end of the execution of 
the subroutine or procedure. 


The data to be used during the execution of the procedure or function is 
passed on at the time of calling by using variables called parameters. 
These will be discussed later in this chapter, after the section concerning 
local and global variables. 


Study Note 


The difference between a procedure and a function is that a function will return a single value after a call, whereas a procedure will perform a particular task, but not necessarily return a value. As the function will return a value, the data type of the function must be declared.
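
The distinction can be illustrated with a short Python sketch. Python has a single def construct, so the difference here is one of convention; the names show_total and total are hypothetical, and the "-> int" annotation is assumed to stand in for declaring the data type of a function (Python does not enforce annotations at run time).

```python
def show_total(a: int, b: int) -> None:
    """Procedure-style: performs a task (printing) and returns no value."""
    print("The total is", a + b)


def total(a: int, b: int) -> int:
    """Function-style: computes a single value and returns it to the caller."""
    return a + b


show_total(2, 3)            # called as a statement
answer = total(2, 3) * 2    # called inside an expression
print(answer)
```

Because total returns a value, its call can appear inside an expression, whereas show_total is only useful for the task it performs.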
You need to be aware that each structured computer language has its
own way of providing these components. For example Pascal, which was 
designed to teach these concepts, has both procedures and functions, 
whereas C has only functions (which can be used as the equivalent of 
procedures) and the original BASIC only provided subroutines (which 
could be used to perform tasks and return values to the calling program). 


In our pseudocode we use both procedures and functions. When we want 
to use a procedure or a function, we shall simply give it a name, and call 
it. 


Definition: program unit 


A constituent part of a large program which is, in a sense, self- 
contained. 


Definition: side effects 


An effect of a program unit which is not apparent from its parameters. 
For example, altering a non-local variable or performing input/output. 


Definition: subroutine 


A piece of code which is obeyed ‘out of line’, i.e. control is transferred 
to the subroutine, and on its completion, control reverts to the 
instruction following the call. (The instruction code of the CPU usually 
provides subroutine jump and return instructions to facilitate this 
operation.) A subroutine saves space as it occurs only once in the
program, though it may be called from many different places in the 
program. It also facilitates the construction of large programs, as 
subroutines can be formed into libraries for general use. (The same 
concept appears in high level languages as the procedure.) 


Local and Global Variables 


In order to ensure that one procedure or function does not interfere with 
another, it is very important to keep track of the contents of variables. 
(This was mentioned earlier in the discussion of the FOR...ENDFOR 
loop.) 


The safest and easiest way to keep track of variables is to use them only 
within individual procedures, and not throughout the whole program. This 
is accomplished by the inclusion of the facility to declare variables as 
local. The values of local variables are lost as soon as the procedure or
function call has concluded. 
If procedures use only local variables, they are ‘self contained’, and thus
much easier to test and amend. This idea of self containment is a basic 
concept of object-oriented languages, and is discussed later in Chapter 5, 
Modelling Objects. 


Definition: local variable 


A term applied to variables which are only accessible in the program module within which they are defined, typically in a procedure or function body.


Definition: global variable 


A term used to describe the scope of a variable: global variables are 
accessible from all parts of a program. 


Global variables keep any values assigned to them and although they can 
be used anywhere in the program, they can also be influenced or 
changed by any part of the program. It is for this reason that they should 
be used with great care! 


In our pseudocode, global variables will be listed at the start of the code 
and local variables within the procedure or function in which they are 
used. 
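
The following Python sketch shows the two kinds of variable side by side. The names counter, add_score and bump_counter are invented for illustration; in Python, a function must use the global statement before it can assign to a global variable.

```python
counter = 0  # global variable: visible to every function in this module


def add_score():
    score = 10          # local variable: exists only during this call
    return score


def bump_counter():
    global counter      # explicitly request write access to the global
    counter += 1


first = add_score()
bump_counter()
bump_counter()
print(first, counter)
```

After the calls, counter has been changed by a function that never received it as a parameter, which is exactly the kind of hidden influence that makes global variables risky.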


Procedures 


To illustrate the use of procedures, we will write a program to create a 
simple quiz on capital cities, basic mathematics and simple chemistry. 
This example will illustrate the use of procedures with local variables. The 
use of parameters will be covered in a later example. The diagram of the 
algorithm is shown in Figure 4.1. 


The procedure names chosen for this example are capital, maths and 
chem. 


The procedures are called from the first block of code; we have called this 
procedure main. In the pseudocode, all procedures and functions will start 
with the name and finish with the statement ENDPROCEDURE or 
ENDFUNCTION. 
[Diagram: the program displays the choices and accepts a choice; an invalid choice displays "Error", choices 1, 2 and 3 run the CAPITAL, MATHS and CHEM processes, and choice 4 ends the program.]


Figure 4.1 Diagram of the quiz program


(There are no global variables)
main (The highest level procedure in the structure)
    choice is a local variable OF TYPE Integer
    choice := 0
    WHILE choice <> 4
    DO
        DISPLAY " Choose which quiz you would like to try"
        DISPLAY " Type in the quiz number"
        DISPLAY " 1. Capital Cities"
        DISPLAY " 2. Basic Maths"
        DISPLAY " 3. Chemical Symbols"
        DISPLAY " 4. End the program"
        REPEAT
            ACCEPT choice
            IF choice < 1 OR choice > 4
                THEN DISPLAY " Error : choose again"
            ENDIF
        UNTIL choice > 0 AND choice < 5
        IF choice = 1 THEN Capital
        IF choice = 2 THEN Maths
        IF choice = 3 THEN Chem
    ENDDO
ENDPROCEDURE
Capital (start of procedure Capital)
    answer is a local variable OF TYPE String
    score is a local variable OF TYPE Integer
    score := 0
    DISPLAY "What is the capital city of England?"
    ACCEPT answer
    IF answer = "London" THEN score := score + 1
    DISPLAY "What is the capital city of France?"
    ACCEPT answer
    IF answer = "Paris" THEN score := score + 1
    DISPLAY "What is the capital city of Japan?"
    ACCEPT answer
    IF answer = "Tokyo" THEN score := score + 1
    DISPLAY "your score is ", score
ENDPROCEDURE


Maths (start of procedure Maths)
    Use local variables number, score OF TYPE Integer
    score := 0
    DISPLAY "What is the square of 16?"
    ACCEPT number
    IF number = 256 THEN score := score + 1
    DISPLAY "What is the square root of 81?"
    ACCEPT number
    IF number = 9 THEN score := score + 1
    DISPLAY "What is the denary value of the hexadecimal number 21?"
    ACCEPT number
    IF number = 33 THEN score := score + 1
    DISPLAY "your score is ", score
ENDPROCEDURE


Chem (start of procedure Chem)
    symbol is a local variable OF TYPE String
    score is a local variable OF TYPE Integer
    score := 0
    DISPLAY "What is the chemical symbol for sodium?"
    ACCEPT symbol
    IF symbol = "Na" THEN score := score + 1
    DISPLAY "What is the chemical symbol for chlorine?"
    ACCEPT symbol
    IF symbol = "Cl" THEN score := score + 1
    DISPLAY "What is the chemical symbol for lead?"
    ACCEPT symbol
    IF symbol = "Pb" THEN score := score + 1
    DISPLAY "your score is ", score
ENDPROCEDURE


In the simple example above, it is clear that the main procedure is calling 
the three lower level procedures. These do not then call other procedures, 
although to solve more complex problems, further calls to other levels are 
likely. 
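
One of the quiz procedures might look like this in Python. This is a sketch under assumptions rather than a translation of the pseudocode: the name capital_quiz and the ask parameter are hypothetical, the questions are held in a list, and the score is returned as well as displayed so that the result can be checked.

```python
def capital_quiz(ask=input):
    """Ask three questions; score is local, so it starts at 0 on every call."""
    score = 0
    questions = [
        ("What is the capital city of England?", "London"),
        ("What is the capital city of France?", "Paris"),
        ("What is the capital city of Japan?", "Tokyo"),
    ]
    for question, correct in questions:
        if ask(question + " ") == correct:
            score += 1
    print("Your score is", score)
    return score


# Scripted answers stand in for keyboard input so the sketch runs unattended.
scripted = iter(["London", "Lyon", "Tokyo"])
result = capital_quiz(lambda prompt: next(scripted))
```

Calling capital_quiz() with no argument would read the answers from the keyboard instead.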


At the lowest level in this problem-solving technique, a procedure, 
function or subroutine is written to solve a single problem. This code can 
then be used again whenever the same problem arises. This is clearly a 
great advantage because it means that code does not have to be 
repeatedly rewritten. In fact, once a reliable solution has been found, a 
library of such solutions can be kept for use whenever they are needed. 
This technique can be expanded further and reliable routines established
which can be used whenever similar algorithms are required. 


Exercise 4.1 [15 minutes] 


The variable score occurs in each procedure of the quiz program, but is declared locally. What would have happened if score had been declared globally?


Using Parameters 


The term parameter is used to describe variables which are used in a 
procedure or function to accept values passed to the procedure or 
function with the procedure call. The technical term for the value passed 
is an argument or actual parameter. The variables used within the 
procedure or function are called formal parameters. 


Definition: parameter 


Information passed to a subroutine, procedure or function. The definition of the procedure is written using formal parameters to denote data items that will be provided when the subroutine is called, and the call of the procedure includes corresponding actual parameters. See also parameter passing.


Two practical examples to illustrate this complicated definition are given below.


Example 1 


The first example describes a procedure to output a character a number 
of times. The character and the number of times displayed will be chosen 
by the user before the procedure is called. Thus, each time the procedure 
is called, the instructions will be operating on data already chosen by the 
user. In order for the procedure instructions to be able to access this data, 
the procedure will need two parameters, one for the character and one for 
the number of times the character is to be displayed. 


The pseudocode syntax for using parameters is to put them in brackets 
with their data type written next to them. In this example, we will use 
design as the procedure name and ch and num as the two parameters. 
Thus the syntax for defining the procedure will be: 


PROCEDURE design (ch OF TYPE Character, num OF TYPE Integer) 


The variables ch and num are called formal parameters; they are the ones defined and used within the procedure.
PROGRAM Pattern
Main Program
    choice, chosen_character OF TYPE Character
    number_of_times OF TYPE Integer
    choice := "Y"
    WHILE choice = "Y"
        DISPLAY " What character would you like output to the screen?"
        ACCEPT chosen_character
        DISPLAY " How many times do you wish this to be sent?"
        ACCEPT number_of_times
        design (chosen_character, number_of_times)
        DISPLAY "Do you want another character? Type Y if you do"
        ACCEPT choice
    ENDWHILE
ENDPROCEDURE


PROCEDURE design (ch OF TYPE Character, num OF TYPE Integer)
    index OF TYPE Integer
    FOR index := 1 TO num
        DISPLAY ch,
    ENDFOR
ENDPROCEDURE


Note that the variables chosen_character and number_of_times are called 
actual parameters or arguments. These are the ones which are defined 
and used within the main program and contain the values which are 
passed to the procedure. 
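
The relationship between formal and actual parameters can be sketched in Python as follows. The sketch mirrors the design procedure, with two assumptions that are not in the pseudocode: the repeated character is built as a single string rather than with a loop, and the string is returned as well as displayed so that it can be inspected.

```python
def design(ch: str, num: int) -> str:
    """ch and num are the formal parameters of the procedure."""
    line = ch * num      # repeat the character num times
    print(line)
    return line


chosen_character = "*"
number_of_times = 5
# chosen_character and number_of_times are the actual parameters (arguments).
pattern = design(chosen_character, number_of_times)
```

The names used at the call need not match the formal parameter names; values are matched by position.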


Definition: argument 


A value or address passed to a procedure or function at the time of call. Thus in the BASIC statement Y=SQR(X), X is the argument of the SQR (square root) function. Arguments are sometimes referred to as actual parameters.


Definition: actual parameter 


Information passed to a subprogram at the call. See also parameter, 
argument. 


Study Note 


There are a variety of techniques for the declaration of parameters. 
The pseudocode representation used here will need to be related to 
your chosen language. 
Example 2


The second example defines a procedure called addup, which uses two parameters, and then shows how it may be used in a program.


PROCEDURE addup(a, b OF TYPE Integer)
    answer OF TYPE Integer
    answer := a + b
    DISPLAY "The sum is ", answer
ENDPROCEDURE


Whenever this procedure is needed, it is called by writing: 


addup(identifier1, identifier2) 


The identifiers are supplied by an earlier part of the program. The procedure would be used as shown below.


main PROCEDURE
    Use variables number1, number2 OF TYPE Integer
    DISPLAY "Enter two numbers "
    ACCEPT number1, number2
    addup(number1, number2)
ENDPROCEDURE


The complete pseudocode is shown below: 


main PROCEDURE
    Use variables number1, number2 OF TYPE Integer
    DISPLAY "Enter two numbers "
    ACCEPT number1, number2
    addup(number1, number2)
ENDPROCEDURE


PROCEDURE addup(a, b OF TYPE Integer)
    answer OF TYPE Integer
    answer := a + b
    DISPLAY "The sum is ", answer
ENDPROCEDURE


Both procedures used in the examples received data from the calling program and processed the data within the procedure. The actual arguments in the calling program have not been changed by the procedure. These procedures are examples of passing parameters by value, a one-way communication of data from the calling program to the procedure. An alternative, passing by reference, which involves two-way communication between the calling program and the procedure, will be addressed later in the chapter.
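
The one-way nature of this communication can be observed in a short Python sketch. Strictly, Python passes references to objects, but for immutable values such as integers the visible behaviour matches the description of passing by value given here. The extra assignment to a inside addup is added purely to show that it has no effect on the caller.

```python
def addup(a: int, b: int) -> None:
    answer = a + b
    a = 0                      # rebinding the formal parameter is local to addup
    print("The sum is", answer)


number1, number2 = 4, 6
addup(number1, number2)
print(number1, number2)        # the caller's variables are unchanged
```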
Exercise 4.2 [15 minutes]
Explain the difference in use between actual and formal parameters. 


Illustrate your answer by reference to the pseudocode solution to 
Example 2. 


Functions 


The main difference between a function and a procedure is that a function
returns a value to the calling program. Most programming languages 
contain a library of functions, e.g. to generate a random number, to 
calculate a variety of mathematical functions such as square root, to find 
and return values from lists. For example: 


ACCEPT X 


DISPLAY random(X) 


This short piece of code accepts the argument you input for X, generates 
a random number in the range 1 to X and returns the answer to be 
displayed on the screen. Note that functions can be used on the right 
hand side of an assignment, for example: 


random_number:= random(X) 
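
In Python, for instance, the equivalent library function is random.randint, which returns an integer in the range 1 to x with both ends included. The fixed value of x below is assumed in place of the ACCEPT step.

```python
import random

x = 6                                   # stands in for the value accepted from the user
random_number = random.randint(1, x)    # inclusive at both ends: 1 <= value <= x
print(random_number)
```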


In our pseudocode we will write the functions using the same construct as 
that for procedures, but using the term ENDFUNCTION to show the finish 
of the code and the word RETURN to indicate the identifier sending the 
value back to the calling place. Note that the call may come from any 
other procedure or function. 


We shall now look again at the simple mathematics example program 
used in Chapter 2 and use it to illustrate the concept of parameters for 
functions, and how a value is returned from a function. 


The program steps required were: 
1. Present a menu of choices to the user; add, subtract, multiply, divide. 


2. Repeat ‘accept the choice and do the calculation’ until ‘end of 
program’ is requested. 


We will use two procedures in the program, one to display the opening 
message — instructions to the user, and the second to accept the choice 
and do the calculation. However, this time we will make the calculation 
part of the procedure a function. 
The pseudocode for the main program and opening message procedure
is: 


main program 
opening_message 
screen_menu 

end program 


PROCEDURE opening_message
    DISPLAY "This is a simple mathematics program."
    DISPLAY "It will offer you a choice of calculation"
    DISPLAY "to be performed on any two numbers you input."
ENDPROCEDURE


The pseudocode for the menu procedure is: 


PROCEDURE screen_menu
    use variables: number1, number2, result, choice OF TYPE Integer
    DISPLAY "Enter your two numbers"
    ACCEPT number1, number2
    DISPLAY "Now choose your calculation type"
    DISPLAY "When you are finished type the number 5"
    REPEAT
        DISPLAY "1: addition"
        DISPLAY "2: subtraction"
        DISPLAY "3: multiplication"
        DISPLAY "4: division"
        DISPLAY "5: endprogram"
        ACCEPT choice
        CASE choice OF
            CASE 1: result := addup(number1, number2)
            CASE 2: result := subtract(number1, number2)
            CASE 3: result := multiply(number1, number2)
            CASE 4: result := divide(number1, number2)
        ENDCASE
        IF choice >= 1 AND choice <= 4
            THEN DISPLAY result
        ENDIF
    UNTIL choice = 5
ENDPROCEDURE


The program calls a function called addup and passes two items of data 
to it as arguments. The data stored in the variables number1 and 
number2 will be used in the function to calculate an answer. This will be 
returned and stored in the identifier result. The pseudocode for the 
function is: 


INTEGER FUNCTION addup(a, b OF TYPE Integer)
    answer := a + b
    RETURN answer
ENDFUNCTION


Note that the data type of the variable answer has not been defined. This is because the function itself has been defined as Integer. This relates to the data type of the data that is returned by the function.


The function uses local variables a and b, of type integer, to calculate and 
return an answer which is also an integer. 
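

As a rough illustration only, the same idea can be sketched in Python, 
which is not the workbook's pseudocode convention. The helper run_choice 
is a hypothetical name introduced for this sketch, and only the addition 
choice is shown. 

```python
def addup(a: int, b: int) -> int:
    """Return the sum of two integers, like the addup function above."""
    answer = a + b  # 'answer' is local to the function
    return answer


def run_choice(choice: int, number1: int, number2: int) -> int:
    """Hypothetical helper: only choice 1 (addition) is sketched here."""
    if choice == 1:
        return addup(number1, number2)
    raise ValueError("only choice 1 is implemented in this sketch")


# The values in number1 and number2 are passed as arguments and the
# returned value is stored in result.
result = run_choice(1, 7, 5)
print(result)  # 12
```

The caller never sees the local variable answer; it only receives the 
returned value. 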


Exercise 4.3 [25 minutes] 


Write the pseudocode for the other three functions, subtract, multiply 
and divide. Do not allow the user to divide by zero and, for simplicity, 
let the result of the divide be an integer. 


Summary of Procedures and Functions 
There are a number of important reasons for using procedures/functions: 


• The code can be built to reflect a structured design with procedures 
  echoing the design at each level of the top-down process. 


• Tasks that are needed repeatedly within a program can be coded 
  once and then called when needed. 


• A smaller size of procedures makes them easier to test and debug, 
  leading to greater overall reliability. 


• Variables can be confined to the procedure in which they are used, 
  thus eliminating unexpected side effects in other parts of the program. 


Arrays are simply a sequence or list of data items, all of which are the 
same data type. Each data item can be referenced by its position in the 
sequence. The normal programming method of referencing an array is to 
use the array name, followed by a subscript containing the element 
number. In mathematics, the third element of an array called marks 
would be referenced as marks₃, but in programming, it would be 
referenced as marks(3) or marks[3] depending on the language. We will 
use the square brackets as our convention. 


The array is stored in the computer in adjacent memory locations and 
each memory location would have an identifier composed of the array 
name followed by the appropriate subscript. Arrays can be 
one-dimensional, two-dimensional or more. Some programming languages use simple lists 
instead of arrays. The reference used will be different but the principles 
are the same. 


Definition: arrays 


An ordered collection of a number of elements of the same type, the 
number being fixed unless the array is flexible. The elements of one 
array may be of type integer, those of another array may be of type 
real, while the elements of a third array may be of type character string 
(if the programming language recognises compound types). 



Chapter 4 — Further Programming Techniques Programming Methods 



Each element has a unique list of index values which determine its 
position in the ordered collection. Each index is of a discrete type. The 
number of dimensions in the ordering is fixed. 


Definition: index 


A list of values of a particular data item contained in a record, enabling 
it to be retrieved more rapidly than by simple serial search. For 
example, a subscript is a value, usually integral, that selects a 
particular element of an array. 


Definition: subscript 


A means of referring to particular elements in an ordered collection of 
elements. For example, if R denotes such a collection of names then 
the ith name in the collection may be referenced by Rᵢ (i.e. R subscript 
i). This printed form is the origin of the term, but it is also used when 
the “subscript” is written on the same line, usually in parentheses or 
brackets: R(i) or R[i]. See also index, array. 


One-Dimensional Arrays 


This type of array is a sequence of data items where each item or element 
can be referenced by the position it occupies in the sequence. The array 
elements can be simple or complex data types, but they must all be the 
same, i.e. homogeneous. 


Definition: one-dimensional array, or vector 


A one-dimensional array, or vector, consists of a list of elements 
distinguished by a single index. 


If v is a one-dimensional array and i is the index value, then vᵢ refers to 
the ith element of v. If the index ranges from L through to U, then the 
value L is called the lower bound of v and U is the upper bound. 
Usually in mathematics, and often in mathematical computing, the 
index type is taken as integer and the lower bound is taken as one. 


Example 1 


A list of numbers can be defined as an array, for example, the 
examination marks for a student. An array named marks contains 23, 34, 
45, 66 in that order. Each element can now be identified as: 


23 is marks[1], 34 is marks[2], 45 is marks[3] and 66 is marks[4]. 


Name is defined as an array of characters such that if the word James is 
stored in the array called name, then “J” is name[1], “a” is name[2], “m” is 
name[3], “e” is name[4] and “s” is name[5]. 
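

The workbook counts array elements from 1. Many languages count from 0 
instead, so the same examples look slightly different in practice. The 
sketch below uses Python lists as a stand-in for arrays, which is an 
assumption of this illustration rather than the workbook's notation. 

```python
marks = [23, 34, 45, 66]
name = list("James")  # a string stored as a list of characters

# The workbook's marks[1] is the first element; in Python that is marks[0].
first_mark = marks[0]
# The workbook's marks[3] and name[5] become index 3 - 1 and 5 - 1.
third_mark = marks[3 - 1]
fifth_letter = name[5 - 1]

print(first_mark, third_mark, fifth_letter)  # 23 45 s
```

Whichever convention a language uses, the subscript still identifies an 
element purely by its position in the sequence. 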


Exercise 4.4 [10 minutes] 


A deck of 52 cards (no joker) is sorted into the suit sequence of hearts, 
clubs, diamonds and spades, with each suit in ascending order of 
value, from 2 to 10, then Jack, Queen, King and Ace. Defining this 
structure as an array called cards, answer the following questions: 


a) Name the cards identified by cards[3], cards[51], cards[27]. 


b) What are the identifiers for the Queen of Hearts, the Jack of 
Diamonds and the Ace of Spades? 


The following pseudocode is for a program which will read in a sequence 
of 10 numbers and then print them out in reverse order. 


PROGRAM reverse 
Use variables: 
    numbers[10]: ARRAY OF TYPE Integer 
    count, n OF TYPE Integer 
FOR (n:=1, n=10, +1) 
    DISPLAY "Enter a number" 
    ACCEPT numbers[n] 
ENDFOR 
DISPLAY "The numbers you entered in reverse order are" 
FOR (count:=10, count=1, -1) 
    DISPLAY numbers[count] 
ENDFOR 
endprogram 
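

A minimal Python sketch of the second loop is given below. It assumes 
the ten numbers are already held in a list instead of being typed in, 
and the function name reverse_order is hypothetical. 

```python
def reverse_order(numbers: list[int]) -> list[int]:
    """Return the items from the last subscript down to the first."""
    reversed_numbers = []
    # Count down from the last index to 0, like FOR (count:=10, count=1, -1).
    for count in range(len(numbers) - 1, -1, -1):
        reversed_numbers.append(numbers[count])
    return reversed_numbers


numbers = [3, 8, 1, 9, 4, 7, 2, 6, 5, 10]
print(reverse_order(numbers))
```

The array itself is not changed; only the order in which the subscripts 
are visited is reversed. 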


One-dimensional arrays can be used with other one-dimensional arrays 
where the data in each array is of a different type. An example is the 
examination marks for students in a class. We could use a 
one-dimensional array called names where the data items are of type 
character, and another one-dimensional array called marks which 
contains data of type integer. In this case, the subscript in each array 
could always refer to the data for the same student, as long as this 
method was also used on data entry. It is simply a matter of ensuring that 
both lists are in the same order. 


If, on data entry, the name of the student is added to the next element in 
the names array and the corresponding mark is entered in the next 
element of the marks array, the two lists will correspond. 


If the marks list is then sorted into ascending marks order, resulting in the 
data items moving place in the marks array, then the elements of the 
names array must be moved in exactly the same way. 


For example: 


| subscript | names array | marks array | 


Figure 4.2 Two arrays showing name of student and marks 
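

The need to keep two arrays in step can be sketched in Python as 
follows. The three names and marks are hypothetical sample data, and 
using the built-in sorted function is a shortcut chosen for this sketch, 
not one of the sorting methods described in this chapter. 

```python
names = ["Ali", "Beth", "Chen"]  # hypothetical sample data
marks = [56, 34, 78]

# Work out the order of subscripts that puts the marks in ascending order,
# then apply the same order to both lists so that they stay in step.
order = sorted(range(len(marks)), key=lambda i: marks[i])
sorted_marks = [marks[i] for i in order]
sorted_names = [names[i] for i in order]

print(list(zip(sorted_names, sorted_marks)))
```

If only the marks were rearranged, each subscript would no longer refer 
to the same student in both arrays. 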


Example Program 


We will write the pseudocode to input the name and examination mark for 
each student in a class of ten. At this stage, the marks will just be 
displayed. 


Later in this chapter you will be sorting the marks into descending order 
and displaying a list of student names and marks starting with the top of 
the class. You will also learn how to display the marks for a particular 
student by looking up the name in the names array and displaying the 
matching element in the marks array. 


PROGRAM enter_and_display_examination_results 
Use variables: 
    marks[10]: ARRAY OF TYPE Integer 
    names[10]: ARRAY OF TYPE String 
    count, n OF TYPE Integer 
FOR (n:=1, n=10, +1) 
    DISPLAY "Enter the student's name" 
    ACCEPT names[n] 
    DISPLAY "Enter the mark for ", names[n] 
    ACCEPT marks[n] 
ENDFOR 
DISPLAY "The examination results are" 
FOR (count:=1, count=10, +1) 
    DISPLAY names[count], marks[count] 
ENDFOR 
endprogram 


If an array of marks was defined as a maximum of 100, then this could be 
changed to a procedure to enter and display the marks. The actual 
number of elements in the array would be determined during the data 
entry part of the procedure, e.g. the user could enter the number of marks 
to follow, or the program could count them until a terminating value was 
entered. This number would be held in a parameter that is updated by the 
procedure and passed back to the main program, where it could be used in 
every call statement to a procedure which accesses the array. 


This is an example of parameter passing by reference rather than by 
value. This is like a two-way communication between the calling program 
and procedure, in that the data can be changed by the procedure and 
then accessed by the calling program. You will be using reference 
parameters in Exercise 4.5. 
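

The two-way communication can be illustrated with a short Python 
sketch. Python does not have reference parameters in the same sense as 
some other languages, so this relies on the assumption that a list 
changed inside a function is the caller's own list. The names 
enter_marks, entered and MAX_MARKS are hypothetical, and the entered 
list stands in for keyboard input. 

```python
MAX_MARKS = 100  # assumed upper limit on the number of marks


def enter_marks(marks: list[int], entered: list[int]) -> int:
    """Copy the entered values into the caller's list and return the count.

    'marks' is the caller's own list object, so the changes made here are
    visible to the caller afterwards.
    """
    marks.clear()
    for value in entered[:MAX_MARKS]:
        marks.append(value)
    return len(marks)


class_marks: list[int] = []
number_of_marks = enter_marks(class_marks, [67, 45, 82])
print(number_of_marks, class_marks)
```

After the call, the calling program can see both the updated array and 
the number of elements actually in use. 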


Definition: flexible array 


An array whose lower and/or upper bounds are not fixed and may vary 
according to the values assigned to it. See also string. 


Definition: string 
Any one-dimensional array of characters. 


Exercise 4.5 [15 minutes] 


Amend the pseudocode in the example to provide an algorithm for a 
procedure to enter each student’s name and mark for a class of any 
number up to a maximum of 100. Write the statement which will call 
the procedure and remember to validate the number of marks to be 
entered into the array. 


Two-Dimensional Arrays 


Two-dimensional arrays have two subscripts to identify an element of the 
array. They can be thought of as the rows and columns in a table. This is 
rather similar to the table produced in Figure 4.2, but you could not use 
that table as a two-dimensional array because the names data elements 
were of type String and the marks were of type Integer. 


All the elements of a two-dimensional array must be of the same data 
type. To illustrate two-dimensional arrays, think of an array of words such 
as a list of first names. As a word is an array of characters, then an array 
of first names is in fact a two-dimensional array and in some languages 
has to be declared as such. The convention in this case is to use two 
subscripts. We shall illustrate this with a practical example. 


The array is called name_list and consists of five names of no more than 
15 characters: 


james, janet, alexander, lee, zacharia 


If we place them in a table (as in Figure 4.3) you can clearly see the 
relationship between the two-dimensional array and the rows and 
columns. The row number would refer to the one-dimensional array for 
one name where each element is a character in the name. 


                         COLUMNS 
        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
ROW 1   j  a  m  e  s 
ROW 2   j  a  n  e  t 
ROW 3   a  l  e  x  a  n  d  e  r 
ROW 4   l  e  e 
ROW 5   z  a  c  h  a  r  i  a 


Figure 4.3 Two-dimensional array of names and characters 


The letter ‘m’ in the name james can be precisely located as name_list[1, 3]. 
Similarly the letter ‘z’ in the name zacharia is name_list[5, 1]. 
Note that the ‘s’ in the name james would be referenced as name_list[1, 5]. 


It is important to know how the two-dimensional array will be referenced, 
in terms of whether the row or column subscripts come first and you must 
identify this for the programming language you will be using. We will 
reference the row number first, so name_list[1, 5] will always mean row 1, 
column 5 in the array. 
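

A small Python sketch of the row-first convention follows. It assumes 
the names are padded with spaces to 15 characters, and the helper 
element is a hypothetical function that converts the workbook's 
subscripts, which start at 1, to Python's, which start at 0. 

```python
WIDTH = 15
names = ["james", "janet", "alexander", "lee", "zacharia"]

# Each row is one name, padded with spaces to 15 characters.
name_list = [list(name.ljust(WIDTH)) for name in names]


def element(row: int, column: int) -> str:
    """Hypothetical helper: name_list[row, column] with subscripts from 1."""
    return name_list[row - 1][column - 1]


print(element(1, 3), element(5, 1), element(1, 5))  # m z s
```

Swapping the two subscripts would select a different character, which is 
why the order must be checked for each language. 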


A process to sort these names into alphabetical order would have to 
compare each name, character by character, to determine the appropriate 
ranking. It would not be sufficient simply to compare each first letter, since 
these may be the same, as in the example where james and janet can 
only be ordered on a comparison test when the third letters are compared. 


i.e. name_list[1, 3] < name_list[2, 3]. 
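

The character-by-character comparison can be sketched in Python as 
below. The function name comes_before is hypothetical, and the sketch 
assumes lower-case names held as ordinary strings. 

```python
def comes_before(first: str, second: str) -> bool:
    """Return True if 'first' ranks before 'second' alphabetically."""
    for a, b in zip(first, second):
        if a != b:
            return a < b  # the first differing character decides
    # All compared characters match, so the shorter name ranks first.
    return len(first) < len(second)


print(comes_before("james", "janet"))  # True
```

For james and janet the first two letters match, so the decision is made 
on the third letters, m and n. 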


In computing, matrices are usually considered to be special cases of 
n-dimensional arrays, expressed as arrays with two indices. The notation for 
arrays is determined by the programming language. The two dimensions 
of a matrix are known as its rows and columns; a matrix with m rows and 
n columns is said to be an m × n matrix. 


Definition: matrix 


A data array of two or more dimensions. 


Exercise 4.6 


            COLUMNS 
        1   2   3   4   5 
ROW 1   1   2   3   4   5 
ROW 2   2   4   6   8  10 
ROW 3   3   6   9  12  15 
ROW 4   4   8  12  16  20 


The multiplication table above shows the 5 times table for the numbers 
1 to 4. Write the pseudocode to calculate the 12 times table for the 
numbers 1 to 15. 


HINT: Use a two-dimensional array with rows 1 to 15 and columns 1 to 
12. Use the fact that the row and column (subscript) values can be 
used in the calculation. You will be using nested FOR ... ENDFOR 
loops. 


Tables: Arrays of Records 


A table of records can be considered as a two-dimensional array in which 
the rows represent the records, and the columns represent the fields. 
Note that all data items of an array have to be of the same type. 


In Figure 4.4 for example, all data items are of type character so we can 
treat the contents of the table as a two-dimensional array called records. 


[Table: each row is a record and each column is a field, including First Name and Grade] 


Figure 4.4 Records and Field names in a two-dimensional table 


Exercise 4.7 [15 minutes] 


With reference to the data in the table (Figure 4.4), answer the 
following: 


a) What data will be retrieved from elements: records[1, 1], records[4, 
4], records[3, 2]? 


b) What are the identifiers for David Smith’s job type, Eric Teo’s grade 
of post and David Smith’s first name? 


c) If another record was added for Elizabeth Hardy who works as a 
manager at a 7 grade, what identifiers would be used to store the 
data? 


However, we could define the record structure by identifying the fields 
within the record and then the array will reduce to a one-dimensional 
array where the subscript relates to the record number. The advantage of 
this method is that all the fields do not have to be of the same data type. 


A record is a data structure in which there are a number of named 
components, called fields, not necessarily of the same type. It may have 
variants in which some of the components, known as variant fields, are 
absent; the particular variant for a given value would be distinguished by a 
discriminant or tag field. 


The record is widely recognised as one of the fundamental ways of 
aggregating data (another being the array) and many programming 
languages offer direct support for data objects that take the form of 
records (see structured variable). Such languages permit operations 
upon an entire record object as well as upon individual components. 


Pseudocode for the Record Data Structure 


The record will be given a name and each field of the record will be 
declared with a name and data type. The one-dimensional array will then 
be defined by providing an identifier, identifying the number of elements 
and declaring that the array is based on the record structure. The fields 
can be referred to individually as in the following example. 


To declare a record called car with fields make, size and colour: 


RECORD car 
    HAS FIELDS make: String 
               size: Integer 
               colour: String 


To declare an array which contains 20 of these records: 


types_of_car [20] ARRAY OF TYPE car 


To refer to each of the fields individually in pseudocode we would type, for 
example: 


types_of_car[3].make to refer to the make field in the 3rd record 
types_of_car[1].size to refer to the size field in the 1st record 
types_of_car[9].colour to refer to the colour field in the 9th record 
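

As an illustration, a record and an array of records can be sketched in 
Python with a dataclass. The class name Car and the sample values are 
assumptions of this sketch, and Python counts elements from 0 instead 
of 1. 

```python
from dataclasses import dataclass


@dataclass
class Car:
    make: str
    size: int
    colour: str


# An array of 20 records, each starting with placeholder values.
types_of_car = [Car(make="", size=0, colour="") for _ in range(20)]

# types_of_car[3].make in the workbook's notation is index 3 - 1 here.
types_of_car[3 - 1].make = "Ford"   # hypothetical sample value
types_of_car[1 - 1].size = 1600     # hypothetical sample value

print(types_of_car[2].make, types_of_car[0].size)
```

Each field keeps its own data type, which is the advantage over a 
two-dimensional array described above. 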


Exercise 4.8 [20 minutes] 


Write the pseudocode to define the record data structure and array for 
the personnel data concerning jobs and grades shown in Figure 4.4. 


Figure 4.2 displayed the names and marks for a class of students and this 
data was previously manipulated using two one-dimensional arrays. The 
record data structure could now be applied to this problem. 


Exercise 4.9 [20 minutes] 


Write the pseudocode to define the record data structure and array for 
the student examination marks shown in Figure 4.2. 
Assume a class size of 25. 


Now that you know how to define arrays of records, it is time to 
incorporate them into programs. We will use the record structure you 
defined for the answer to Exercise 4.9. 


Example Program Using Arrays of Records 


A procedure is required to output the name of the student with the highest 
mark from a series of names and marks entered at the keyboard and 
stored in an array of records. There are currently 12 members of the class 
and the class size will never be more than 100. The procedure is called by 
the statement highest(results, 12). 


The algorithm for finding the highest mark is: 

1. Store the subscript of the highest mark in an identifier, highest. 

2. Assume initially that the first mark is the highest and every time you 
   find one higher than this, make that one the highest by saving its 
   subscript in the identifier highest. 

3. When every element in the array has been checked, the elements 
   classlist[highest].name and classlist[highest].mark can be displayed. 


PROCEDURE highest(classlist, number_of_students) 
    RECORD student 
        HAS FIELDS name: String 
                   mark: Integer 
    classlist[100] ARRAY OF TYPE student 
    n, number_of_students, highest OF TYPE Integer 
    -- enter results 
    FOR (n:=1, n=number_of_students, +1) 
        DISPLAY "enter student name" 
        ACCEPT classlist[n].name 
        DISPLAY "enter student mark" 
        ACCEPT classlist[n].mark 
    ENDFOR 
    -- find highest 
    highest := 1 
    FOR (n:=2, n=number_of_students, +1) 
        IF classlist[n].mark > classlist[highest].mark 
            THEN highest := n 
        ENDIF 
    ENDFOR 
    -- display highest mark 
    DISPLAY "The student with the highest mark is ", classlist[highest].name 
    DISPLAY "The mark was ", classlist[highest].mark 
ENDPROCEDURE 
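

The search for the highest mark can be sketched in Python as follows. 
The sketch assumes the records have already been entered, uses a 
hypothetical Student dataclass with made-up sample data, and returns the 
record so that the caller can display it. 

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    mark: int


def highest(classlist: list[Student], number_of_students: int) -> Student:
    """Return the record holding the highest mark."""
    highest_index = 0  # assume the first mark is the highest to start with
    for n in range(1, number_of_students):
        if classlist[n].mark > classlist[highest_index].mark:
            highest_index = n  # save the subscript of the new highest mark
    return classlist[highest_index]


results = [Student("Ali", 56), Student("Beth", 78), Student("Chen", 64)]
top = highest(results, len(results))
print(top.name, top.mark)  # Beth 78
```

Only the subscript is remembered during the search, so both fields of 
the winning record are available at the end. 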


Exercise 4.10 [15 minutes] 


This procedure could be made more useful if the number of students 
could be determined in the procedure at the same time as the array 
data was entered. Assuming a maximum of 100 in a class, what 
changes would you make to the above pseudocode to add the 
following: 


Allow the user to input the number in the class before entering the list 
of names and marks. Validate this number — it must be between 1 and 
100 inclusive. If the user inputs a number outside the range, provide 
an error message and ask for another number. Write an example of 
the calling statement to this procedure. 


5 Arrays: Sorting and Searching 


The techniques of external sorting will not be covered in this workbook. 
The following section deals with internal sorts, i.e. performed in the 
memory of the computer. 


Definition: sorting 


Rearranging information into ascending or descending order by means 
of sort keys. 


Sorting may be useful in three ways: to identify and count all items with 
the same identification, to compare two files, and to assist in searching, 
as used in a dictionary. An internal sorting method keeps the information 
within the computer’s high speed RAM; an external sorting method uses 
backing store. There is a wide variety of methods. 


There are many different algorithms for sorts and this section only gives 
examples of three different types. The algorithms for sorting will work just 
as well for arrays of tables as for one-dimensional arrays. The different 
sorting strategies will be illustrated by the use of lists, for presentational 
simplicity. The following sorts will be looked at: 
• Selection sorts: 
    - selection sort using two arrays; 
    - straight selection sort. 
• Straight insertion sort. 
• Exchange selection sort, often called a bubble sort. 


Definition: sort key 


The information, associated with a record of information, that is to be 
compared in a sorting process. It follows that the sort keys must be 
capable of being ordered, i.e. for any two keys k1 and k2, exactly one 
of k1 < k2, k1 = k2 or k1 > k2 is true. 


Selection Sorts 


Example 1: A Selection Sort Using Two Arrays 


This sort uses two arrays, the original one and an extra one which is 
originally empty but is then gradually filled with data items in the required 
order. The technique is outlined below: 


• Look through the original array to find the lowest element (for a sort in 
  ascending order). 


• This element is written to the empty array in position one, and the 
  element in the data array is marked by overwriting the key with a 
  rogue value, i.e. one which is clearly identified as not belonging to the 
  actual data (see definition). 


• The search is repeated finding the next lowest key (ignoring rogue 
  values) and writing this in position two in the second array, and 
  overwriting as before. Each of these searches, where the whole array 
  is looked at, is called a pass. 


• The process continues until the sort is complete. 


Note that this would be a very inefficient sort for large arrays of data. 
Figure 4.5 shows the contents of the two arrays after each pass during 
the search. The initial state of the data array is 205, 310, 45 and the extra 
array is empty. 


            After pass 1       After pass 2       After pass 3 
Subscript   Data     Extra     Data     Extra     Data     Extra 
            array    array     array    array     array    array 
1           205      45        999      45        999      45 
2           310                310      205       999      205 
3           999                999                999      310 


Figure 4.5 Table showing the contents of elements of the array after each pass 
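

A minimal Python sketch of this two-array technique is shown below. It 
assumes a rogue value of 999 that is larger than any real data item, so 
that marked elements are never selected again while real data remains. 
The function name is hypothetical. 

```python
ROGUE = 999  # assumed rogue value; must be larger than any real data item


def two_array_selection_sort(data: list[int]) -> list[int]:
    """Return a new list holding the items of 'data' in ascending order."""
    data = list(data)  # work on a copy so the caller's list is untouched
    extra = []         # the second array, empty to begin with
    for _ in range(len(data)):  # one pass per element
        lowest = 0
        for i in range(1, len(data)):
            if data[i] < data[lowest]:
                lowest = i
        extra.append(data[lowest])  # write the lowest key to the extra array
        data[lowest] = ROGUE        # mark the element as already used
    return extra


print(two_array_selection_sort([205, 310, 45]))  # [45, 205, 310]
```

Every pass scans the whole data array, which is why the method is 
inefficient for large arrays. 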


Definition: pass 


A single scan through a body of data, for example by a compiler 
reading the program text or a statistical package reading its data. 


Definition: rogue value 


A value added at the end of a table that can be recognised as a 
termination signal by a table lookup program. 


Example 2: Straight Selection Sort 


The following description outlines an algorithm for a selection sort using 
an exchange of variables which avoids the use of the extra array needed 
by the previous method. In this straight selection sort: 


• The array is scanned to find the location of the smallest element. This 
  element is then exchanged with the first member of the array. This is 
  called the first pass: the whole data has been looked at to find the 
  smallest item. 


• The array is scanned again, starting at the second member. The 
  location of the smallest remaining member is determined, and the 
  element at this location exchanged for the second member of the 
  array. This is called the second pass. 


• The process continues until all array elements have been scanned 
  and placed in their correct position. 


This will involve using nested loops. The outer loop controls the pass, and 
the inner loop compares the elements from that point to find the lowest 
element. 


Definition: straight selection sort 

A sorting algorithm based upon finding successively the record with the 
largest sort key and putting it in the correct position, then the record 
with the next largest key etc. 


Pseudocode Solution for a Straight Selection Sort 

This algorithm is to sort an array of size n using a basic selection 
technique. 


VARIABLES 
    numbers[n]: ARRAY OF TYPE Integer 
    count, pass, lowest, temp OF TYPE Integer 
-- Note: pass < n. An array of 20 would only require 19 passes. At the 
-- end of the 19th pass, the 19th and 20th elements would be in the 
-- correct position. 
FOR (pass := 1, pass = n - 1, +1) 
    lowest := pass 
    FOR (count := pass + 1, count = n, +1) 
        IF numbers[count] < numbers[lowest] 
            THEN lowest := count 
        ENDIF 
    ENDFOR 
    -- exchange the contents of numbers[pass] and the lowest element, 
    -- using temp as temporary storage 
    temp := numbers[pass] 
    numbers[pass] := numbers[lowest] 
    numbers[lowest] := temp 
ENDFOR 


Figure 4.6 shows the contents of the elements of the array after each 
pass and Figure 4.7 shows the contents of the variables pass, count, 
lowest and temp during the four passes. 


[Table: array subscript against array contents after each pass] 


Figure 4.6 Table showing the contents of each array element after each pass of a 
straight selection sort 


Notice how the data item 310 was moved a number of times during this 
sort before it arrived in the correct place. This sort is quite useful if the 
data is already in a reasonable sequence. 


[Table: contents of the variables pass, count, lowest and temp during each pass] 


Figure 4.7 Table showing the contents of the variables used during each pass of a 
straight selection sort 
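

The straight selection sort can be sketched in Python as follows. The 
five data values are hypothetical and are not the data used in the 
figures. Python subscripts start at 0, so the passes run from 0 to n - 2. 

```python
def straight_selection_sort(numbers: list[int]) -> None:
    """Sort 'numbers' in place into ascending order."""
    n = len(numbers)
    for pass_index in range(n - 1):  # n - 1 passes are enough
        lowest = pass_index
        # Find the smallest element from this position onwards.
        for count in range(pass_index + 1, n):
            if numbers[count] < numbers[lowest]:
                lowest = count
        # Exchange it with the element at the start of the unsorted part.
        temp = numbers[pass_index]
        numbers[pass_index] = numbers[lowest]
        numbers[lowest] = temp


data = [205, 310, 45, 120, 87]  # hypothetical sample data
straight_selection_sort(data)
print(data)  # [45, 87, 120, 205, 310]
```

No second array is needed because each exchange places one more element 
in its final position. 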


Exercise 4.11 [25 minutes] 


a) Write the pseudocode for a program which accepts six numbers 
from the keyboard, stores them in an array, sorts them using a 
straight selection sort and then displays them in ascending order. 


b) Using test data of 56, 34, 78, 45, 89, 23, create a table showing the 
contents of the elements of the array as shown in Figure 4.6. 


Straight Insertion Sort 


This is an effective sort when sorting small arrays, but rather slow when 
sorting larger arrays. The algorithm is: 


• On the first pass, the first two elements of the array are compared and 
  arranged in order. 


• On the second pass, the third element is compared with the first two 
  and inserted into the correct position. This involves moving the third 
  element into the correct position and shuffling all the other elements 
  along. 


• The process is repeated on each pass, comparing the current 
  element with the previous ones to see where its position should be, 
  until every element in the list has been placed in its correct position. 


Definition: straight insertion sort (sifting technique, sinking 
technique) 


A sorting algorithm that looks at each sort key in turn and, on the basis 
of this, places the record corresponding to the sort key correctly with 
respect to previous sort keys. 


Before looking at the pseudocode, study the table in Figure 4.8, which shows 
the contents of the elements of the array after each pass. The original list 
of numbers is 21, 16, 35, 47, 19, 12. 


Initial Array | First       Second      Third       Fourth      Fifth 
              | Pass        Pass        Pass        Pass        Pass 
              | (pass:= 2)  (pass:= 3)  (pass:= 4)  (pass:= 5)  (pass:= 6) 
21            | 16          16          16          16          12 
16            | 21          21          21          19          16 
35            | 35          35          35          21          19 
47            | 47          47          47          35          21 
19            | 19          19          19          47          35 
12            | 12          12          12          12          47 


Figure 4.8 Table showing the contents of the array elements after each pass of a 
straight insertion sort 


Notice that in the 4th pass, 19 is moved from the 5th element to the 2nd 
element and the old elements 2, 3 and 4 are moved along the array by 
one subscript position. Thus the 5th element has been inserted in its 
correct position in the array. 


The following pseudocode is the algorithm for the straight insertion 
process shown above, assuming the numbers are stored in the array 
show. 


VARIABLES 
    show[6]: ARRAY OF TYPE Integer 
    position, pass, copy OF TYPE Integer 
FOR (pass := 2, pass = 6, +1) 
    copy := show[pass] 
    position := pass 
    -- stop at position 1 so that show[position - 1] is always a valid element 
    WHILE position > 1 AND show[position - 1] > copy 
        show[position] := show[position - 1] 
        position := position - 1 
    ENDWHILE 
    show[position] := copy 
ENDFOR 
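

For comparison, a Python sketch of the same straight insertion process 
is given below, using the list of numbers from Figure 4.8. Python 
subscripts start at 0, so the loop test becomes position > 0. 

```python
def straight_insertion_sort(show: list[int]) -> None:
    """Sort 'show' in place into ascending order."""
    for pass_index in range(1, len(show)):
        copy = show[pass_index]  # the element to be inserted
        position = pass_index
        # Shuffle larger elements along by one place to make room.
        while position > 0 and show[position - 1] > copy:
            show[position] = show[position - 1]
            position -= 1
        show[position] = copy


show = [21, 16, 35, 47, 19, 12]
straight_insertion_sort(show)
print(show)  # [12, 16, 19, 21, 35, 47]
```

The position test comes first in the loop condition so that the 
element before the start of the array is never examined. 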


Exercise 4.12 [10 minutes] 

Using test data of 56, 34, 78, 45, 89, 23, provide a table showing the 
elements of the array during the processing of a straight insertion sort, 
as shown in Figure 4.8. 


Exchange Selection Sort 


This is called a bubble sort because the largest element rises to the top of 
the array (i.e. the element with the highest subscript) in the first pass, and 
then the next largest bubbles up to the next position in the second pass, 
and so on. Of course the array could be sorted in descending order, in 
which case the smallest element would be the one ‘bubbled’ to the top. 


Definition: bubble sort (exchange selection) 

A form of sorting by exchanging, which simply interchanges pairs of 
elements that are out of order in a sequence of passes through the file, 
and continuing until no such pairs exist. This method does not compete 
with straight insertion. 


The algorithm is simple and effective, particularly when none of the 
elements are far from their final position and there are only 20 to 30 
elements. The algorithm is: 


• On the first pass, the first two members of the array are compared, 
  and exchanged if necessary. The process is repeated with the second 
  and third elements, and then the third and the fourth, and so on until 
  the largest element arrives at the top, i.e. in the element with the 
  highest subscript. 


• Now that one element is in the correct position, the process is 
  repeated until the next largest is in the next to the end position. 
  However, it is not necessary to include the last element, as that is 
  already in the correct position. Thus during each successive pass, 
  one less element than in the previous pass is considered. 


• This is repeated until every element has been sorted. This will 
  happen when the complete cycle of passes has ended, or as soon as 
  no more exchanges have taken place. This obviously means that a 
  record has to be kept of whether any exchanges have taken place 
  during a pass (if not, then all elements must be in the correct order). 


Assume that the original list is 3, 1, 2, 5, 4. 


First pass 

Contents (subscripts 1 to 5):  3  1  2  5  4 

Compare the contents of the 1st and 2nd elements: is (3 > 1)? 
    1st <= 2nd: No exchange. 
    1st > 2nd:  1st and 2nd elements swap places via temp. 
Here 3 > 1, so the elements swap places. 

Contents (subscripts 1 to 5):  1  3  2  5  4 

Compare the contents of the 2nd and 3rd elements: is (3 > 2)? 
    2nd <= 3rd: No exchange. 
    2nd > 3rd:  2nd and 3rd elements swap places via temp. 
Here 3 > 2, so the elements swap places. 

Contents (subscripts 1 to 5):  1  2  3  5  4 


Figure 4.9 Diagram showing the first two steps in an exchange selection (bubble) 
sort 


The pseudocode for a bubble sort of an array of 10 integers is: 


main program 
Use local variables 
    example[10]: ARRAY OF TYPE Integer 
    exchange_count, pass, temp, n OF TYPE Integer 
pass := 1 
REPEAT 
    exchange_count := 0 
    FOR (n:=1, n = 10 - pass, +1) 
        IF example[n] > example[n+1] 
            THEN 
                -- these three statements exchange elements n and n+1 
                temp := example[n] 
                example[n] := example[n+1] 
                example[n+1] := temp 
                -- this statement records the number of exchanges that 
                -- have taken place in this pass (only need to know 
                -- whether > 0) 
                exchange_count := exchange_count + 1 
        ENDIF 
    ENDFOR 
    pass := pass + 1 
UNTIL exchange_count = 0 
endprogram 


To illustrate the logic behind this pseudocode, consider a sequence of 
numbers 41, 23, 42, 32, 67, 53, 46, 99, 81, 73. Figure 4.10 shows the 
contents of the elements of the array at the end of each pass and how the 
variables change during the loop. 


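Subscript   After 1st pass   After 2nd pass   After 3rd pass 
1           23               23               23 
2           41               32               32 
3           32               41               41 
4           42               42               42 
5           53               46               46 
6           46               53               53 
7           67               67               67 
8           81               73               73 
9           73               81               81 
10          99               99               99 


Variables used 
during loop        1st pass         2nd pass         3rd pass 
pass               1                2                3 
n                  1 to 9           1 to 8           1 to 7 
exchange_count     1, 2, 3, 4, 5, 6 1, 2, 3          0 


Thus the program will end after the 3rd pass as exchange_count = 0 


Figure 4.10 Table showing contents of elements and variables during each pass 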
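

The bubble sort can be sketched in Python as shown below, using the same 
sequence of ten numbers. Returning the number of passes is an addition 
made for this sketch so that the early finish can be checked. 

```python
def bubble_sort(example: list[int]) -> int:
    """Sort 'example' in place and return the number of passes made."""
    passes = 0
    while True:
        exchange_count = 0
        # Each pass considers one less element than the previous pass.
        for n in range(len(example) - 1 - passes):
            if example[n] > example[n + 1]:
                # Exchange elements n and n + 1.
                example[n], example[n + 1] = example[n + 1], example[n]
                exchange_count += 1
        passes += 1
        if exchange_count == 0:  # no exchanges, so the array is in order
            return passes


example = [41, 23, 42, 32, 67, 53, 46, 99, 81, 73]
passes = bubble_sort(example)
print(passes, example)
```

For this data the array is in order after two passes, and the third pass 
makes no exchanges, which is the signal to stop. 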


Exercise 4.13 [10 minutes] 


Using test data of 56, 34, 23, 45, 89, 78, create a table showing the 
elements of the array during the processing of an exchange selection 
(bubble) sort, as shown in Figure 4.10. 


Searching 


Finding one particular element in an array is simply a matter of looking 
through the array, element by element, until the required key is found. 
This can be done sequentially, i.e. starting from the first element and 
looking for a matching key until one is found. For example: 


IF key_to_be_found = array[n] THEN found := TRUE 


The identifier found is of type Boolean. This means that found can only 
have two values, either 0 or 1, where 0 is the same as FALSE and 1 is the 
same as TRUE. Boolean identifiers are used by programmers as switches, 
to indicate one of two states. This can be useful in the logic of the 
program, as it allows the programmer to record the state of things in 
certain situations, as in this case where it is used to record the fact 
that the item has been found. The identifier used can then be tested in 
other parts of the procedure or may be returned through a function. 


The pseudocode for this type of search is shown below: 


-- note that key_to_be_found and key_array would contain data at 
-- this point of the program. The key_to_be_found data could have 
-- been input over the keyboard 
Use local variables 
    key_array[10]: ARRAY OF TYPE String 
    key_to_be_found OF TYPE String 
    count OF TYPE Integer 
    found OF TYPE Boolean 
begin search procedure 
count := 1 
found := FALSE 
REPEAT 
    IF key_to_be_found = key_array[count] 
        THEN found := TRUE 
        ELSE count := count + 1 
    ENDIF 
UNTIL found = TRUE OR count > 10 
-- the program would then continue 
-- taking appropriate actions with the item found - element count 
-- or a "not found" result 


As it is important to also determine if the particular key is present, it may 
be necessary to search the entire array. This method is clearly inefficient 
(imagine the number of names in a telephone book), although for small 
searches this may not matter. 
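

As a rough Python equivalent of the sequential search, the sketch below returns the position of the first match, or None when the whole array has been examined without success. Note that Python subscripts start at 0, whereas the pseudocode counts from 1. The function name and the sample names are assumptions made for illustration.

```python
def sequential_search(key_array, key_to_be_found):
    """Return the 0-based index of the first match, or None if absent."""
    for count, key in enumerate(key_array):
        if key == key_to_be_found:
            return count          # found: stop looking
    return None                   # every element was examined


# Sample data, deliberately unsorted
names = ["Sutcliffe B", "Adams K", "Ratcliffe N"]
print(sequential_search(names, "Ratcliffe N"))
print(sequential_search(names, "Smith J"))
```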


Searching a Sorted List 


A more efficient method is to start by sorting the array on the key to be 
found. A test can then be included to stop the search when the key is 
matched or when the key reached is greater than the key to be found. 
This would involve: 


• Processing the array, one element at a time, in a loop. 


• Comparing the key_to_be_found with each successive element in the 
array and stopping the loop when a match is found, as described in 
the previous example. 


• In addition, checking whether the current element in the array is 
greater than the key_to_be_found. If this is the case, then it means 
that you have passed the position where key_to_be_found should be 
and thus can conclude that the key_to_be_found is not there. For 
example: 


IF key_to_be_found < array[n] THEN .... 


Consider the list of names used earlier in the chapter in Figure 4.2, 
reproduced here for your convenience. 


[Table with the columns: subscript | names array | marks array. The entries shown include “Ratcliffe N” and “Sutcliffe B”.] 


Copy of Figure 4.2 Two arrays showing name of student and marks 


The names are in sorted order, so they can be used to illustrate the search 
method described above. 


If the name “Smith J” was entered via the keyboard then 
key_to_be_found:= “Smith J”. “Smith J” is greater than “Ratcliffe N”, but is 
less than “Sutcliffe B”. If you were doing the search, you would stop as 
soon as you found an entry higher than the required one. This is exactly 
how the search algorithm works. As the list is supposed to be in 
alphabetical order (i.e. sorted), then it is safe to assume that “Smith J” is 
not in the list. 


This is the pseudocode algorithm for the search of the sorted array of 10 
key names. 


Use variables 
    key_array: ARRAY [10] OF TYPE String 
    key_to_be_found OF TYPE String 
    count, n OF TYPE Integer 
    found, missing OF TYPE Boolean 

-- note that key_to_be_found and key_array would contain data at 
-- this point of the program. The key_to_be_found data could have 
-- been input over the keyboard 

begin search procedure 
    count := 1 
    found := FALSE 
    missing := FALSE 
    REPEAT 
        IF key_to_be_found = key_array[count] 
            THEN found := TRUE 
            ELSE IF key_to_be_found < key_array[count] 
                THEN missing := TRUE 
                ELSE count := count+1 
            ENDIF 
        ENDIF 
    UNTIL found = TRUE OR count > 10 OR missing = TRUE 

-- the program would then continue 
-- taking appropriate actions with the item found 
-- or a "not found" result 
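

The early-exit idea can be sketched in Python as follows. The function name, the extra count of elements examined and the sample names other than “Ratcliffe N”, “Sutcliffe B” and “Smith J” are assumptions added for illustration; the list must already be in ascending order.

```python
def sorted_sequential_search(key_array, key_to_be_found):
    """Search an ascending list, stopping early once a larger key is met.

    Returns (0-based index or None, number of elements examined).
    """
    examined = 0
    for count, key in enumerate(key_array):
        examined += 1
        if key == key_to_be_found:
            return count, examined        # found
        if key > key_to_be_found:
            return None, examined         # passed its position: missing
    return None, examined                 # reached the end of the list


names = ["Adams K", "Ratcliffe N", "Sutcliffe B", "Young T"]
print(sorted_sequential_search(names, "Smith J"))
print(sorted_sequential_search(names, "Sutcliffe B"))
```

Searching for “Smith J” stops at “Sutcliffe B”, the third element, without looking at the rest of the list.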


The Binary Search 


This method can substantially reduce search times when dealing with 
larger arrays. There is more processing per comparison, as a calculation 
is required, but far fewer comparisons are needed, because each 
comparison halves the section of the array that remains to be searched. 


The technique is as follows: 


1. Determine the start and end subscript values for the array, start and 
end, for example. 

2. Find the centre of the array: subscript = (start + end)/2. 

3. Compare the key required with the key at this location. 

4. If the key required is less than the key found, the key we are looking 
for is in the first half of the array. Otherwise, it is in the second half of 
the array. 

5. This process, steps 1 to 4 above, can now be repeated with the 
appropriate section of the array, i.e. a new start or end subscript value 
based on the previous mid-point subscript, and a new mid-point is 
calculated. This process is continued until the key is found. 


It is, of course, possible that the key may not be found at all and an 
allowance must be made for this eventuality. 


Pseudocode for this Algorithm 


The variable start is used as the lower boundary of the section currently 
being searched, and end as the upper boundary. At the beginning of the 
search, these are the subscripts of the first and last elements of the array. 


Use variables 
    start, end, middle OF TYPE Integer 
    found OF TYPE Boolean 
    key_required OF TYPE String 
    key_name: ARRAY [10] OF TYPE String 

start := 1 
end := 10 
found := FALSE 
REPEAT 
    middle := (end+start)/2    -- integer division 
    IF key_required = key_name[middle] 
        THEN found := TRUE 
        ELSE IF key_required < key_name[middle] 
            THEN end := middle-1 
            ELSE start := middle+1 
        ENDIF 
    ENDIF 
UNTIL found = TRUE OR start > end 
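

For comparison, here is a minimal Python sketch of the same halving technique. It uses 0-based subscripts and a WHILE-style loop so that an empty list is handled as well; the function name and the sample numbers are assumptions made for illustration.

```python
def binary_search(key_name, key_required):
    """Return the 0-based index of key_required in a sorted list, or None."""
    start = 0
    end = len(key_name) - 1
    while start <= end:
        middle = (start + end) // 2        # integer division
        if key_required == key_name[middle]:
            return middle
        if key_required < key_name[middle]:
            end = middle - 1               # continue in the lower half
        else:
            start = middle + 1             # continue in the upper half
    return None                            # start > end: key is not present


numbers = [23, 32, 41, 42, 46, 53, 67, 73, 81, 99]
print(binary_search(numbers, 67))
print(binary_search(numbers, 50))
```

Finding 67 in this ten-element list takes four comparisons (against 46, 73, 53 and 67), whereas a sequential search would need seven.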


Exercise 4.14 [20 minutes] 


Amend the pseudocode to provide a function which will perform a 
binary search on a sorted array of variable length and return the 
subscript number of the found item in the array. Note that this means 
that you must pass the size of the array as one of the function 
arguments. Assume a maximum of 100 elements in the declared array. 


6 Linked Lists and Arrays 


Arrays and linked lists represent two different implementation 
mechanisms that allow classic data structures to be held within a 
computer program. It is possible to represent any data structure as either 
a linked list or an array. In most structured languages it is possible to 
have linked lists and arrays of any type. 


Each programming language will have a particular way of defining and 
accessing arrays. Sometimes they will be addressed differently in terms of 
whether the row or column is the first integer. You will need to make sure 
you understand the following in any programming language you learn: 

• how to declare the size and data type of the array; 

• whether arrays are row major or column major; 

• whether the index must be an integer. 


It is best to use an array where the maximum size of the data to be stored 
is known or where fast access to the data can be achieved through an 
integer index. In general, the elements of an array are held in consecutive 
locations in memory. 


Linked Lists 


Lists are similar to arrays in that elements are stored sequentially in 
memory. However, whereas arrays are based on the mathematical 
concept of matrices, lists are based on the everyday concept of a 
sequence of items, for example, a ‘things to do’ list or a column of 
expense amounts to be added. The data in the list can be of any type, a 
list of names, a list of numbers, even a list of lists, and is usually enclosed 
in brackets with the elements of the list separated by a comma. For 
example: 


• A list of names could be names:= (“Smith J”, “Harman H”, “Jones M”). 


• A list of examination marks for one student could be marks:= (23, 45, 
68). 


• A list of exam results could be called exam_results and contain: 
(“Smith J”, (23, 45, 78), “Harman H”, (45, 67), “Jones M”, (56, 34, 89, 
78)). The list (23, 45, 78) is a sub-list of the list exam_results. 


Study Note 


Lists make better use of storage than arrays. The maximum length of 
the list does not need to be defined at the beginning of the program. 
Lists provide the ability to dynamically allocate store as a program 
executes and to add, delete and change elements in a structure. 


Definition: list 


A finite ordered sequence of items (x1, x2, x3, ... xn) where n >= 0. If n = 
0, the list has no elements and is called the null list (or empty list). If n 
> 0, the list has at least one element, x1, which is called the head of the 
list (see also header). The list consisting of the remaining items is 
called the tail of the original list. The tail of a null list is the null list, as 
is the tail of a list containing only one element. 


The items in a list can be arbitrary in nature, unless otherwise stated. In 
particular, it is possible for an item to be another list, in which case it is 
known as a sub-list. For example, let L be the list (A, B, (C, D), E) then 
the third item of L is the list (C, D), which is a sub-list of L. If a list has one 
or more sub-lists, it is called a list structure. If it has no sub-lists it is 
called a linear list. The two basic representation forms for lists are 
sequentially allocated lists and linked lists, the latter being more flexible. 
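

The terms head, tail and sub-list can be tried out in Python, whose built-in lists are written with square brackets rather than parentheses. The helper names head, tail and is_linear below are assumptions made for illustration, not standard functions.

```python
def head(items):
    """First item of a non-empty list."""
    return items[0]


def tail(items):
    """The list of remaining items; the tail of a null list is the null list."""
    return items[1:]


def is_linear(items):
    """True when the list has no sub-lists."""
    return not any(isinstance(item, list) for item in items)


L = ["A", "B", ["C", "D"], "E"]
print(head(L))         # the head of L
print(tail(L))         # the tail of L
print(L[2])            # the third item, which is itself a list
print(is_linear(L))    # L has a sub-list, so it is a list structure
```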


Definition: linked list 


A list representation in which the items are not necessarily sequential 
in storage. Access is made possible by the use, in every item, of a link 
that contains the address of the next item in the list. The last item in the 
list has a special null link to indicate that there are no more items in the 
list. See also doubly linked list, singly linked list. 


Linked lists contain items which are not necessarily stored in sequence. 
Each item contains a link to the next item in the sequence. Thus each 
element of the list contains: 

• the data required by the programmer; 


• a pointer which points to the next element in the list in terms of 
sequence rather than storage. 


Definition: link 


A pointer that indicates the storage address of an item of data. Thus 
when a field of an item A in a data structure contains the address of 
another item, B, i.e. of its first word in memory, it contains a link to B. 
See also linked lists. 


Programming languages differ in the way that linked lists are used. Some 
provide functions which allow operations to be performed on the list while 
others have no list manipulation functions and the programmer has to 
implement the structure by using arrays. Information concerning linked 
lists can thus be separated into: 


• how linked lists work; 


• how they are used by programmers. 


How Linked Lists Work 


The most important fact is that an element in the list actually contains at 
least two items of data. In addition to the data that the programmer 
requires to be stored in the list, at least one pointer will also be stored. 
The pointer provides the address of the next data item in the sequence. 
Thus: 


• one element in the list is linked to the next element in the list by the 
pointer; 

• an item of data plus pointer make one element in the list. 


This allows the programmer to traverse through the list from the beginning 
to the end. This is called a singly linked list. A more complicated and 
flexible linked list could have two pointers for each element, one to point 
to the next element in the sequence and one to point to the previous 
element in the sequence. This would allow the programmer to start at any 
element in the list and traverse either forwards or backwards. This type of 
linked list is called a doubly linked list. 


An element in a doubly linked list contains at least three items of data. In 
addition to the data that the programmer requires to be stored in the list, 
at least two pointers will also be stored. The next_pointer provides the 
address of the next data item in the sequence and the previous_pointer 
provides the address of the previous data item in the sequence. Thus: 


• one element in the list is linked to the next element in the list by the 
next_pointer; 


• one element in the list is linked to the previous element in the list by 
the previous_pointer; 


• data item + next_pointer + previous_pointer make one element in the 
list. 


Note that there is a special value for all pointers, NIL or NULL, which 
means that the pointer points at nothing. This is used to terminate 
linked lists. 
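

A minimal Python sketch of these ideas is given below. It assumes that each element is an object holding the data and a reference to the next element, with Python's None playing the role of the NIL pointer; the names Node, build and traverse are illustrative only.

```python
class Node:
    """One element of a singly linked list: an item of data plus a pointer."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node      # None marks the end of the list


def build(items):
    """Link the items in order and return the first node (the start pointer)."""
    start = None
    for item in reversed(items):
        start = Node(item, start)
    return start


def traverse(start):
    """Follow the pointers from the first element to the NIL terminator."""
    visited = []
    current = start
    while current is not None:
        visited.append(current.data)
        current = current.next
    return visited


start = build(["I", "do", "not", "understand", "linked", "lists"])
print(traverse(start))
```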


In order to manipulate the list, values for a start pointer and first free 
storage pointer will also be needed. 


Linked lists make better use of storage than arrays. The maximum length 
of the list does not need to be defined at the beginning of the program. 


Singly Linked Lists 


Definition: singly linked list (one-way linked list) 
A linked list in which each item contains a single link to its successor. 
By following links it is possible to access the entire structure from the 
first item. 


The following example illustrates how singly linked lists work. 


The sentence “I do not understand linked lists” could be written as a list, 
as demonstrated in Figure 4.11. 


Linked List 

Address   Data            Pointer to      Comment 
                          next element 
1         “I”             2               next element in the sequence is element 2 
2         “do”            3               next element in the sequence is element 3 
3         “not”           4               next element in the sequence is element 4 
4         “understand”    5               next element in the sequence is element 5 
5         “linked”        6               next element in the sequence is element 6 
6         “lists”         null            null pointer indicates the last item of data 
7                                         empty, containing no data 

start pointer = 1: the start pointer has an initial value of 1, so points to 
the first element in the linked list. 

free storage pointer = 7: the free storage pointer has an initial value of 7, 
i.e. points to the address of the first available free element in the linked list. 


Figure 4.11 A singly linked list 


If the word “not” is deleted from the list, all the elements stay in the same 
place in the list, but the third element in the list will contain no data. 


The values of the pointers will change to indicate where the next free 
space is. This is shown in Figure 4.12. 


Linked List (delete “not” from the linked list) 

Address   Data            Pointer to      Comment 
                          next element 
1         “I”             2 
2         “do”            4               Position 2: pointer to next element changed 
                                          from 3 to the next_pointer of item 3 (was 4); 
                                          this ensures that the deleted one is bypassed 
3                                         Position 3: data item deleted from list; free 
                                          storage pointer changed to this deleted one 
                                          (same as next_pointer in previous position 
                                          before it was updated) 
4         “understand”    5 
5         “linked”        6 
6         “lists”         null 
7                                         empty, containing no data 

start pointer = 1: the start pointer has an initial value of 1, so points to 
the first element in the linked list. 

free storage pointer = 3: the free storage pointer still points to the 
address of the first available free element in the linked list. 


Figure 4.12 Deleting an item from a singly linked list 


If the sentence required was “I do understand arrays and linked lists”, 
then the words “arrays” and “and” would need to be added to the list. 


As the list is reflecting the order of the sentence, we would want to add 
“arrays” to the 4th position in the list and the word “and” to the 5th position 
in the list. If the words were to be added to the list in the order “arrays” 
first and then “and”, then the first available space would be used for 
“arrays” and the second item “and” would be entered in the next available 
space in the list. 


The pointers would be updated to indicate the position of the data in terms 
of the list, i.e. the position of the word in the sentence. In this case, the 
word “arrays” should be in position 4 and “and” should be in position 5. 
Note how this is represented in Figure 4.13. 


© NCC Education. All rights reserved. Unauthorised duplication is prohibited. 


Programming Methods 


Linked List 

Address   Data            Pointer to      Comment 
                          next element 
1         “I”             2 
2         “do”            4 
3         “arrays”        7               Insert “arrays” into position 4 in the sequence: 
                                          new data inserted at free storage address (3); 
                                          next_pointer changed to indicate next element (5th) 
4         “understand”    3               previous 3rd element (“understand”): change 
                                          next_pointer to address of new element; 
                                          free storage pointer changed to 7 
5         “linked”        6 
6         “lists”         null 
7         “and”           5               Insert “and” into position 5 in the linked list: 
                                          new data inserted into empty element (7) and 
                                          next_pointer changed to 5; 
                                          free storage pointer changed to 8 

start pointer = 1: the start pointer has an initial value of 1, so points to 
the first element in the linked list. 

free storage pointer = 8: the free storage pointer still points to the 
address of the first available free element in the linked list. 

Figure 4.13 Showing a singly linked list after insertions and additions. 
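

The deletion and the two insertions can be traced with a small Python sketch that keeps the list in fixed arrays, using array subscripts as pointers. This is a sketch under stated assumptions rather than a definitive implementation: the class name ArrayLinkedList, the capacity of 8 and the use of subscript 0 as the null pointer are all choices made for illustration, and free elements are assumed to be chained together through their own next pointers.

```python
NULL = 0  # subscript 0 is left unused and stands for the null pointer


class ArrayLinkedList:
    def __init__(self, words, capacity):
        self.data = [None] * (capacity + 1)
        self.next_pointer = [NULL] * (capacity + 1)
        for address, word in enumerate(words, start=1):
            self.data[address] = word
            self.next_pointer[address] = address + 1 if address < len(words) else NULL
        self.start = 1 if words else NULL
        # chain the unused elements together as free storage
        self.free = len(words) + 1 if len(words) < capacity else NULL
        for address in range(len(words) + 1, capacity + 1):
            self.next_pointer[address] = address + 1 if address < capacity else NULL

    def _address_at(self, position):
        """Follow the pointers to the element at a 1-based sequence position."""
        address = self.start
        for _ in range(position - 1):
            address = self.next_pointer[address]
        return address

    def delete(self, position):
        target = self._address_at(position)
        if position == 1:
            self.start = self.next_pointer[target]
        else:
            previous = self._address_at(position - 1)
            self.next_pointer[previous] = self.next_pointer[target]  # bypass
        self.data[target] = None
        self.next_pointer[target] = self.free   # freed element joins free storage
        self.free = target

    def insert(self, position, word):
        if self.free == NULL:
            raise OverflowError("no free storage left")
        new = self.free
        self.free = self.next_pointer[new]
        self.data[new] = word
        if position == 1:
            self.next_pointer[new] = self.start
            self.start = new
        else:
            previous = self._address_at(position - 1)
            self.next_pointer[new] = self.next_pointer[previous]
            self.next_pointer[previous] = new

    def to_list(self):
        words, address = [], self.start
        while address != NULL:
            words.append(self.data[address])
            address = self.next_pointer[address]
        return words


sentence = ArrayLinkedList(["I", "do", "not", "understand", "linked", "lists"], 8)
sentence.delete(3)             # delete "not"; free storage pointer becomes 3
sentence.insert(4, "arrays")   # stored at address 3; free storage pointer becomes 7
sentence.insert(5, "and")      # stored at address 7; free storage pointer becomes 8
print(sentence.to_list())
print(sentence.free)
```

The words end up in storage order “I”, “do”, “arrays”, “understand”, “linked”, “lists”, “and”, yet following the pointers from the start pointer still yields the sentence in its logical order.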


V1.1 


Exercise 4.15 [25 minutes] 


Using the table structure in Figure 4.11, which contains the original 
data, update the table to show the final values for the next pointer and 
free storage pointer when the sentence has changed from the original 
to “first_name second_name now understands pointers and linked 
lists”. Note the difference between a word’s position in the logical 
sequence and its position in storage. Make the changes in the 
following order: 


a) delete “not”; 

b) add “pointers” to the 4th position in the sequence; 

c) add “and” to the 5th position in the sequence; 

d) change “do” to “now”, change “I” to your second name, change 
“understand” to “understands”; 

e) add your first name to the first position in the sequence. 


Definition: doubly linked list (two-way linked list; symmetric list) 


A linked list where each item contains links to both its predecessor and 
its successor. This makes it possible to traverse the list in either 
direction. The flexibility allowed by double linking must be offset against 
the overheads of the storage and the setting and resetting of the extra 
links involved when items are inserted or removed. 


Remember, these work in the same way as singly linked lists, except that 
each record also has a pointer to the previous element in the sequence. 
The example in Figure 4.14 of a list of numbers 47, 53, 64, 75, 86, 97, 
illustrates how doubly linked lists work. 


Doubly Linked List 

Address   Pointer to    Data    Pointer to    Comment 
          previous              next 
          element               element 
1         null          47      2             null pointer indicates the first item of data. 
2         1             53      3 
3         2             64      4 
4         3             75      5 
5         4             86      6 
6         5             97      null          null pointer indicates the last item of data. 
7                                             Empty, containing no data. 

free storage pointer = 7 


Figure 4.14 Diagram representing doubly linked list 


If the number 47 is deleted from the list, all the elements stay in the same 
place in the list, but the first element in the list will contain no data. When 
items are deleted from a doubly linked list, the previous element needs 
its next_pointer to be changed and the following element needs its 
previous_pointer to be changed, to indicate the change in the sequence. 
The free storage pointer will change to indicate where the next free 
space is. This is illustrated in Figure 4.15 by showing the effect of deleting 
elements 47 and 64 in that order. 


Doubly Linked List 

Address   Pointer to    Data    Pointer to    Comment 
          previous              next 
          element               element 
1                                             47 deleted 
2         null          53      4 
3                                             64 deleted 
4         2             75      5 
5         4             86      6 
6         5             97      null 
7                                             Empty, containing no data. 

a) 47 deleted. Element 1: previous_pointer was null and next_pointer was 2, 
so previous_pointer of next element (2) := null. Free storage pointer 
changed to 1. 

b) 64 deleted. Element 3: previous_pointer was 2 and next_pointer was 4, 
so previous_pointer of next element (4) changed from 3 to 2 and 
next_pointer of previous element changed from 3 to 4. Free storage 
pointer changed to 3. 

Free storage pointer: this would be a list (or stack) where 3 is the top 
entry and then 1, 7, 8. Thus the free storage pointer list is 3, 1, 7, 8. 


Figure 4.15 Diagram representing doubly linked list after deletions 


If item 66 is now added to position 2 in the list, the next_pointer of the 
previous element and the previous_pointer of the next element will need 
to be updated to reflect the new sequence. This is illustrated in Figure 
4.16. 


Doubly Linked List 

Address   Pointer to    Data    Pointer to    Comment 
          previous              next 
          element               element 
1                                             Empty (47 deleted) 
2         null          53      3 
3         2             66      4             66 inserted in correct position (2nd) at 
                                              free storage pointer address (3) 
4         3             75      5 
5         4             86      6 
6         5             97      null 
7                                             Empty, containing no data. 

Element 3: 66 to be added. The previous element (2) had a next_pointer of 4 
and the next element (4) had a previous_pointer of 2. 

New element: next_pointer := 4, previous_pointer := 2. 
Previous element (2): next_pointer := 3. 
Next element (4): previous_pointer := 3. 
Free storage pointer changed to 1. 

Free storage pointer: this would be a list (or stack) where 3 has been 
removed, so 1 is now the top entry, then 7, 8. Thus the free storage 
pointer list is now 1, 7, 8. 


Figure 4.16 Diagram representing doubly linked list after additions 
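

The pointer updates for deletion and insertion can also be sketched in Python. In this sketch, which rests on assumptions rather than on the figures, object references take the place of array addresses, so no free storage list is needed; the class and method names are illustrative only.

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.previous = None
        self.next = None


class DoublyLinkedList:
    def __init__(self, values=()):
        self.first = None
        self.last = None
        for value in values:
            self.insert_after(self.last, value)

    def insert_after(self, node, value):
        """Insert value after node; passing None inserts at the front."""
        new = DNode(value)
        new.previous = node
        new.next = node.next if node is not None else self.first
        if new.previous is not None:
            new.previous.next = new        # next_pointer of previous element
        else:
            self.first = new
        if new.next is not None:
            new.next.previous = new        # previous_pointer of next element
        else:
            self.last = new
        return new

    def find(self, value):
        node = self.first
        while node is not None and node.data != value:
            node = node.next
        return node

    def delete(self, node):
        if node.previous is not None:
            node.previous.next = node.next
        else:
            self.first = node.next
        if node.next is not None:
            node.next.previous = node.previous
        else:
            self.last = node.previous

    def forwards(self):
        values, node = [], self.first
        while node is not None:
            values.append(node.data)
            node = node.next
        return values

    def backwards(self):
        values, node = [], self.last
        while node is not None:
            values.append(node.data)
            node = node.previous
        return values


numbers = DoublyLinkedList([47, 53, 64, 75, 86, 97])
numbers.delete(numbers.find(47))
numbers.delete(numbers.find(64))
numbers.insert_after(numbers.find(53), 66)   # 66 becomes the 2nd element
print(numbers.forwards())
print(numbers.backwards())
```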


Exercise 4.16 [10 minutes] 


Amend the table shown in Figure 4.16 to show the effect of adding 98 
and 32, in that order, to a doubly linked list. 


Exercise 4.17 [10 minutes] 


Now delete 86 from the doubly linked list. What do the next and 
previous pointers contain for the elements containing the data 75 and 
97? 


What does the free storage pointer list contain? 


How Are Linked Lists Used by Programmers? 


The answer to this will differ, depending on whether the programming 
language used provides functions for list manipulation. If lists are not 
managed by the programming language, then the programmer has to 
provide this logic by using arrays, remembering to include the pointers, 
which will in effect be array element numbers. Note that some 
programming languages (e.g. C or C++) allow the storage for arrays to be 
allocated dynamically. 


Linked List Functions 


Those programming languages which can deal with lists would have a 
number of functions provided for list manipulation. Programmers can then 
use these lists and the functions provided to create singly linked lists or 
doubly linked lists. Some of the available functions are described below, 
with some pseudocode examples of use. 


Assume a sorted list of numbers 23, 34, 45, 67 and a sentence of “I do 
not understand linked lists”. 


The functions can be categorised by use. 


Defining lists, emptying or clearing lists: 

• set numbers = [23, 34, 45, 67] will define the list and create it; 

• set sentence = [“I”, “do”, “not”, “understand”, “linked”, “lists”] will define 
the list and create it; 

• set numbers = [] will create a new empty list or clear an existing list. 


Adding or inserting elements, changing elements and deleting elements: 

• append numbers, 56 would result in 23, 34, 45, 67, 56; 

• add numbers, 56 would result in 23, 34, 45, 56, 67 where the number 
56 has been inserted into its correct position; 

• addAt numbers, 2, 30 would result in 23, 30, 34, 45, 56, 67 where the 
number 30 has been inserted into position 2. This would have been 
the function to use when we inserted the word “arrays” as the 4th 
word in the sentence in the previous example, e.g. addAt sentence, 
4, “arrays”; 

• setAt numbers, 1, 15 would result in the first element, 23, being 
replaced with 15; 

• deleteAt sentence, 3 would remove the word “not” from the sentence 
list. 


Other functions might be: 

• providing information about the list, e.g. the number of elements in the 
list, the minimum and maximum values; 

• searching lists, e.g. finding the element “arrays” and returning its 
position in the list (4) or finding the 4th element in the list and 
returning the data item (“arrays”). 


It is worth noting here that a list is often implemented as an object, with 
the functions as its methods. 
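

Python is one language in which the list is a built-in object with methods of this kind. The sketch below shows rough Python counterparts of the pseudocode functions; the pairing is an assumption made for illustration, and because Python positions start at 0, each 1-based position is written as position - 1.

```python
import bisect

numbers = [23, 34, 45, 67]                 # set numbers = [23, 34, 45, 67]

appended = list(numbers)
appended.append(56)                        # append numbers, 56

bisect.insort(numbers, 56)                 # add numbers, 56 (keeps sorted order)
numbers.insert(2 - 1, 30)                  # addAt numbers, 2, 30
after_add_at = list(numbers)
numbers[1 - 1] = 15                        # setAt numbers, 1, 15

sentence = ["I", "do", "not", "understand", "linked", "lists"]
del sentence[3 - 1]                        # deleteAt sentence, 3
sentence.insert(4 - 1, "arrays")           # addAt sentence, 4, "arrays"

# information about a list, and searching it
position = sentence.index("arrays") + 1    # 1-based position of "arrays"
fourth = sentence[4 - 1]                   # data item at position 4
summary = (len(numbers), min(numbers), max(numbers))

print(appended, after_add_at, numbers)
print(sentence, position, fourth, summary)
```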


Where the programming language cannot manage lists, the programmer 
could use arrays, where the pointer addresses are subscripts. The 
programmer must make sure that the pointers are also stored as an 
element of data, in addition to the actual data. 


Using Arrays to Manage Linked Lists 


When using arrays, an element in a linked list is usually defined as some 
sort of record, e.g. 


RECORD element 
    HAS FIELDS 
        datum: String 
        next: Integer 
list [6] AN ARRAY OF TYPE element 


Note that each pointer will be an array subscript, whereas the pointers of 
linked lists contain addresses. 


Example 1 The Sentence List 


For a singly linked list the variable definitions will be: 


RECORD sentence 
    HAS FIELDS 
        words: String 
        next_pointer: Integer 

Use variables 
    sentence_list [20] AN ARRAY OF TYPE sentence 


Figure 4.17 Pseudocode definition of array record structure for singly linked list 


Example 2 The Results List 


For a doubly linked list the variable definitions will be: 


RECORD results 
    HAS FIELDS 
        previous_pointer: Integer 
        name: String 
        exam_mark: Integer 
        next_pointer: Integer 

Use variables 
    result_list [20] AN ARRAY OF TYPE results 


Figure 4.18 Pseudocode definition of array record structure for doubly linked list 


The record consists of some data which can be of any type and may even 
be a pointer to another linked list or element. In this example it is string 
data. 


There are usually four pointers associated with a linked list, although this 
depends on what data structure the linked list is being used for. These 
four main pointers are: 


1. first: a pointer to the first element in the list, i.e. the one with a null 
previous_pointer; 

2. last: a pointer to the last element in the list, i.e. the one with a null 
next_pointer; 

3. current: a pointer to the current element of interest; 

4. free storage: a pointer to the first available free storage. (Note that 
the programmer will have to deal with the situation when the fixed size 
array becomes full.) 


When using arrays these could be defined as: 
first_pointer, last_pointer, current_pointer, first_available_pointer OF TYPE Integer 


There are two basic operations that can be performed on a linked list: 


• add: adds an element to a linked list. The element must be identified 
in terms of its position in the sequence (e.g. when you added your first 
name, it had to be the first word in the sentence). 


• delete: deletes an element from a linked list. This may be deletion by 
position number in the sequence or by the data. 


When using arrays, the programmer will have to write procedures to cover 
these functional requirements. Two possible procedures are: 


add(list_name, element_number, data), e.g. add(sentence_list, 4, “arrays”) to add 
“arrays” to position 4 in the sequence. 


delete(list_name, element_number), e.g. delete(sentence_list, 3) to remove item 3 
from the list. 


There could also be a change procedure, such as change(list_name, 
existing_datum, new_datum), e.g. change(sentence_list, “do”, “now”). 


When using linked lists, the amount of information stored is limited only by 
the memory available to the program, which is allocated dynamically. 
Obviously, if the programmer is providing the coding using arrays, this 
advantage will be lost, as the maximum size of the array must be defined. 


Exercise 4.18 [25 minutes] 


Write the pseudocode for a procedure to add an element to a particular 
position in an array which represents a singly linked list. Assume a 
record of stock_code (Character), description (String) and price (Real) 
and a maximum size array of 100. The items in the list are in 
stock_code order. 


Assume that three functions exist: 


first_pointer(array) returns the subscript of the first element in the 
sequence; 


last_pointer(array) returns the subscript of the last element in the 
sequence; 


free_storage(array) returns the subscript of the next free element. 


Summary 


The choice of storage mechanism for implementing a data structure can 
be critical to the performance of a computer application. In many cases, 
the size of the data to be manipulated is unknown and hence, a linked list 
must be used. However, if the maximum size of the data is known, or can 
be determined at the start of the program, an array can often present a 
more efficient storage structure. 


Many languages do not support dynamic data storage mechanisms such 
as linked lists. In these languages, the implementation of the data 
structures must be done in an array and it is the programmer's 
responsibility to manage the size of the array. Realistically, you are 
advised to only use linked lists in programming languages which provide 
them. The main purpose of looking at linked lists and arrays is to provide 
a better understanding of how linked lists work, rather than to recommend 
their use. 


7 Data Structures: Queues and Stacks 


7.1 Introduction 


Queues and stacks are types of abstract data structures. They are 
implemented using linked lists, arrays or lists. 


Definition: abstract data type (ADT) 


A data type that is defined solely in terms of the operations which apply 
to objects of the type, without commitment as to how the value of such 
an object is to be represented. An ADT includes both data and related 
operations, provides a means to encapsulate details whereby the data 
is completely hidden from its surroundings, and ADT operations 
provide loose coupling to the outside world via a function interface. 


7.2 Queues 


A queue is an abstract data structure in which the first element added to 
the queue is also the first element to be removed. It is often referred to as 
a FIFO (First In First Out) list. It is single threaded and the only operations 
that can be performed are to access the next element in the queue, add 
elements to the queue and remove elements from the queue. 


It is, of course, possible to have a queue of queues. 


Definition: queue (FIFO list; pushup stack; pushup list) 


A linear list in which all insertions are made at one end of the list and all 
removals and accesses at the other end. 


This data structure is used whenever the processing of elements must 
occur in the order in which they arrive — for example, the processing of 
people queuing in a shop or at a bus stop. However, when a person 
leaves the front of a queue, everyone else in the queue moves forward, 
one place nearer to the front of the queue. 


When a queue data structure is used, no movement takes place as 
entries are removed from the queue. Instead, the data always stays in the 
same storage location until it is removed from the queue. The position of 
the front of the queue and the next available place in the queue are 
recorded using pointers. When an entry is removed from the queue, the 
pointer to the front of the queue is updated. When an item is added to the 
queue, the pointer to the next available space is updated. 


Queues can be stored in the form of lists. A list used to store a queue is 
called a push-up list or push-up stack. There are only three operations 
required for handling a queue: 


• add an element to the queue; 

• access the next element; 

• remove an element from the queue. 


Programming languages which support queue handling provide these 
functions. For a linked list representation, two variables are needed, one 
for the front of the queue and one for the back. When an element is 
removed from the queue, the front pointer is moved to the next element 
and when an element is added to the back, a new element is created and 
linked to the list. Standard functions for list manipulation are: 


• deleting elements in positions in the list (in this case it would always 
be position 1); 


• appending to the end of the list. 


Thus queue manipulation falls within the bounds of normal list 
manipulation. 


Theoretically, queues could be implemented as arrays, provided that the 
absolute maximum size of the queue is known or elements can be 
stopped from entering the queue until a space is available. 


One way to implement this is to have two variables; one to point to the 
array index representing the first element in the queue and one to point to 
the array index representing the last element in the queue. 


If an element is removed from the queue, the variable representing the 
first element in the queue is incremented. If an element is added to the 
end of the queue, the variable representing the last element in the queue 
is incremented. If either variable reaches the end of the array, it is set 
back to the beginning. (This is known as a cyclic representation.) In 
practice, programmers would be advised to use this only in programming 
languages which support list manipulation. 
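
A cyclic representation can be sketched in Python as shown below. The class name ArrayQueue and its fields are assumptions made for this example. The sketch records the next available place rather than the last element, and it keeps a count of stored elements so that a full queue can be told apart from an empty one; that count is an addition not described in the text.

```python
class ArrayQueue:
    """Queue stored in a fixed-size array, treated as cyclic."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.front = 0      # index of the first element in the queue
        self.next_free = 0  # index of the next available place
        self.count = 0      # number of elements currently stored (assumption)

    def add(self, item):
        if self.count == len(self.items):
            raise OverflowError("queue is full")
        self.items[self.next_free] = item
        # On reaching the end of the array, go back to the beginning.
        self.next_free = (self.next_free + 1) % len(self.items)
        self.count += 1

    def remove(self):
        if self.count == 0:
            raise IndexError("queue is empty")
        item = self.items[self.front]
        self.items[self.front] = None
        self.front = (self.front + 1) % len(self.items)
        self.count -= 1
        return item


queue = ArrayQueue(3)
for item in ("A", "B", "C"):
    queue.add(item)
removed = queue.remove()  # frees the first array position
queue.add("D")            # wraps round into the freed position
```

After these steps the array holds D, B, C, with both pointers at index 1: B is at the front of the queue and D has taken the space that A left.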


Definition: storage structure 


The mapping of a data structure to its implementation (which may be 
another data structure). A good choice of storage structure permits an 
easy and efficient implementation of a given data structure. 

Exercise 4.19 [25 minutes]


A queue contains the numbers 34, 78, 43, 12, with 34 being the first 
element in the queue. Draw a diagram to represent a queue which is 
stored in an array of 6 items. Show the changes which occur when the 
numbers 34 and 78 are removed from the queue and 26, 56 and 34 are 
added to the queue. (Number 34 has rejoined the queue but will have 
to wait its turn.) Remember to identify the values of the two pointers 
which are required. 


Do not concern yourself with how the program knows which is the next 
available free element. This question is intended to aid your 
understanding of the steps involved in queue management, and you 
will see the empty places in your diagram. 


Stacks 


A stack is an abstract data structure and differs from a queue in the way 
in which data is added and removed. Data is added to the ‘top’ of the 
stack and is also removed from the ‘top’ of the stack. This is similar to the 
removal of a playing card from the top of the pile of cards during a game, 
where the last element added to the stack is also the first element to be 
removed from the stack. This is often referred to as a LIFO (Last In First 
Out) list. It is single threaded and the only operations that can be 
performed are to push elements onto the stack and pop elements from 
the stack. 


Definition: stack 


A linear list where all accesses, insertions and removals are made at
one end of the list, called the top. This implies access on a last in first
out (LIFO) basis: the most recently inserted item on the list is the first to
be removed. The operations push and pop refer to the insertion and
removal of items at the top of the stack.


When using a browser, the back button will allow users to retrace their 
navigation steps, as the previous web pages will appear on the screen in 
the reverse order to the way they were initially accessed. That is, the last 
one accessed appears first, and then the one before that, etc. This is an 
example of a stack, where the last one added to the stack is the first one 
to be accessed or removed. Another example is the list of most recent 
files provided when opening a word processing file. The most recent 
always appears at the top of the list. 

Stacks can be stored in the form of lists. A list used to store a stack is
called a push-down list or push-down stack. There are only three 
operations required for handling a stack: 


• add an element to the stack (called push);
• access the next element;
• remove an element from the stack (called pop).


Programming languages which support stack handling provide these 
functions. For a linked list representation, only one pointer is needed, to 
identify the first element in the list. When an element is removed from the 
stack, the front pointer is moved to the next element and when an element 
is added to the stack, a new element is created in the first position in the 
list and linked to the original first item in the list. 


Standard functions for list manipulation are deleting elements in positions 
in the list (in this case it would always be position 1) and inserting 
elements at a particular position in the list (in this case it would always be 
position 1). Thus stack manipulation falls within the bounds of normal list 
manipulation. 
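
A linked list stack along these lines can be sketched in Python. The names Node, LinkedStack, push, peek and pop are assumptions for this example. A single pointer, here called top, identifies the first element in the list.

```python
class Node:
    """One element of the list: the data plus a link to the next element."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedStack:
    """Stack held as a linked list; only the first element is ever touched."""

    def __init__(self):
        self.top = None  # the single pointer to the first element

    def push(self, data):
        # The new element takes the first position and links to the old first.
        self.top = Node(data, self.top)

    def peek(self):
        # Access the next element without removing it.
        if self.top is None:
            raise IndexError("stack is empty")
        return self.top.data

    def pop(self):
        # The pointer is moved to the next element in the list.
        if self.top is None:
            raise IndexError("stack is empty")
        node = self.top
        self.top = node.next
        return node.data


stack = LinkedStack()
for item in ("A", "B", "C"):
    stack.push(item)
last_in = stack.pop()  # "C" was added last, so it is removed first
```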


A stack can be implemented using arrays, in a similar fashion to queues, 
but with one major difference. 


Stacks can be implemented as arrays provided that the absolute 
maximum size of the stack is known, or elements can be stopped from 
entering the stack until a space is available. 


One way to implement this is to have a single variable to point to the array 
index representing the last element in the stack (i.e. the top of the stack). 
If the array contains 51 data elements, the last element in the stack is in 
position 51. 


If an element is removed from the stack, the variable representing the last 
element in the stack is reduced by 1. 


If an element is added to the stack, the variable representing the last 
element in the stack is incremented by 1. 
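
The array version can be sketched in Python as follows. The class name ArrayStack is an assumption for this example. Python subscripts start at 0, so the sketch uses -1 to mean an empty stack, whereas the text counts positions from 1.

```python
class ArrayStack:
    """Stack stored in a fixed-size array with a single top-of-stack index."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = -1  # index of the last element; -1 means the stack is empty

    def push(self, item):
        if self.top + 1 == len(self.items):
            raise OverflowError("stack is full")
        self.top += 1  # the variable is incremented by 1
        self.items[self.top] = item

    def pop(self):
        if self.top == -1:
            raise IndexError("stack is empty")
        item = self.items[self.top]
        self.items[self.top] = None
        self.top -= 1  # the variable is reduced by 1
        return item


stack = ArrayStack(4)
for item in ("A", "B", "C"):
    stack.push(item)
popped = stack.pop()
```

Unlike the queue, only one end of the array is ever used, so a single variable is enough and there is no need for a cyclic arrangement.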


Exercise 4.20 [25 minutes] 


A stack contains the numbers 34, 78, 43, 12 with 12 being the last one
to be added to the stack. Draw a diagram to represent a stack which is
stored in an array of 6 items. Show the changes which occur when the
numbers 12 and 43 are removed from the stack and then 26, 56 and
43 are added to the stack. Remember to identify any pointers you will
need.

Data Structures: Graphs and Trees


Introduction 


Graphs and trees are abstract data structures used to describe data and 
operations in real life situations. In order to implement these structures, 
you have to choose a storage structure such as arrays or linked lists. 


Graphs 


A graph represents a multi-connected network of elements where each 
node has a link to one or more elements, for example a network of railway 
tracks between stations. In the examples given, nodes are shown by 
numbered circles and the links are shown by lines with arrowheads 
showing the direction of the link. 


Graphs can be cyclic or acyclic depending on whether any of the links in 
the graph form cycles. Figure 4.19 shows an acyclic graph. There are no 
cycles. A cycle would be formed if, in starting at one node and following 
the links, you could arrive back at your starting place. 
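
The test for a cycle can be sketched in Python using a depth-first search. The function name has_cycle and the two small graphs are hypothetical examples; they are not the graphs in Figures 4.19 and 4.20. Each graph is held as a dictionary that maps a node to the list of nodes its links point to.

```python
def has_cycle(links):
    """Return True if following the directed links can lead back to a node
    that is already on the path being explored."""
    UNVISITED, ON_PATH, FINISHED = 0, 1, 2
    state = {node: UNVISITED for node in links}

    def visit(node):
        state[node] = ON_PATH
        for neighbour in links.get(node, []):
            neighbour_state = state.get(neighbour, UNVISITED)
            if neighbour_state == ON_PATH:
                return True  # arrived back at a node on the current path
            if neighbour_state == UNVISITED and visit(neighbour):
                return True
        state[node] = FINISHED
        return False

    return any(state[node] == UNVISITED and visit(node) for node in list(links))


acyclic_graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
cyclic_graph = {1: [2], 2: [3], 3: [1]}
```

Here has_cycle(acyclic_graph) is False, because no path returns to its starting place, while has_cycle(cyclic_graph) is True because 1 leads to 2, 2 to 3 and 3 back to 1.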


Figure 4.19 An acyclic graph 


Figure 4.20 shows a cyclic graph. It contains cycles, e.g. 


Figure 4.20 A cyclic graph



Chapter 4 — Further Programming Techniques Programming Methods 



Exercise 4.21 [5 minutes] 


Can you find another cycle in the cyclic graph shown in Figure 4.20? 


Graphs are abstract data types, in that they are defined solely in terms of 
the operations which apply to the objects (i.e. the links between the 
nodes) without commitment as to how the value of each object is to be 
represented. A graph is a very difficult data structure to manage and has 
a number of problems: 


• It is difficult to establish the number of links there are per element. In
  fact, each element has to have a list of pointers to other elements.
• Manipulation and traversing are complex.


In order to implement these structures, a programmer would use arrays 
and linked lists as the storage structure. 


It is possible to represent a graph as a two-dimensional array where the
elements in the graph are on the x and y axes and the (x,y) position is
filled in if the elements are connected.
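
A two-dimensional array of this kind can be built in Python as sketched below. The function name adjacency_matrix and the list of links are hypothetical; the links are not taken from Figure 4.19. Nodes are numbered from 1, and position (x, y) holds 1 when there is a link from node x to node y.

```python
def adjacency_matrix(node_count, links):
    """Build a node_count x node_count array; a 1 marks a link from x to y."""
    matrix = [[0] * node_count for _ in range(node_count)]
    for start, end in links:
        # Nodes are numbered from 1, Python subscripts from 0.
        matrix[start - 1][end - 1] = 1
    return matrix


links = [(1, 2), (1, 3), (2, 4), (3, 4)]
matrix = adjacency_matrix(4, links)
for row in matrix:
    print(row)
```

Because the links have a direction, the array is not symmetrical: a 1 at (1, 2) does not imply a 1 at (2, 1).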


Figure 4.21 Array representation of the acyclic graph from Figure 4.19 


Exercise 4.22 [20 minutes] 


Represent the cyclic graph shown in Figure 4.20 as an array. 

Trees


Trees are a subset of graphs and are very useful in sorting data. They
are hierarchical data structures, rather like a family tree. The elements of
a tree are called nodes. Figure 4.22 shows a tree structure. Note that
there is a single root element, each element has a set of children (which
may be empty) and each element other than the root has a single, unique
parent.


Figure 4.22 A tree structure 


One special type of tree is a binary tree, where each element has at most
two child elements, and each element other than the root still has a single
parent. Note that, going down and across the structure, at each node there
may be a left branch, a right branch, or both.


Thus each element consists of data and at least two pointers, a left
pointer and a right pointer.


Figure 4.23 A binary tree 


Binary trees are a useful data structure for sorting, or keeping a sequence 
in order when additions and deletions are made. 


Assume the numbers 55, 44, 83, 66 and 47 are to be placed in a tree, in 
that order. The binary tree created is shown in Figure 4.24. 

1  55 is the first element.
2  44 < 55 so appears on the left.
3  83 > 55 so appears on the right.
4  66 > 55 but < 83 so appears to the left of 83.
5  47 < 55 but > 44 so appears to the right of 44.


Figure 4.24 Binary tree created from 55, 44, 83, 66 and 47 in that order 
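
The placement rule used in these steps can be sketched in Python. The names TreeNode and insert are assumptions for this example, and sending equal values to the right is also an assumption, since the text does not say what happens to duplicates.

```python
class TreeNode:
    """One element of the tree: data plus a left and a right pointer."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Place value in the tree: smaller values go left, others go right."""
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


root = None
for number in (55, 44, 83, 66, 47):
    root = insert(root, number)
```

The resulting tree has 55 at the root, 44 on its left and 83 on its right, with 66 to the left of 83 and 47 to the right of 44, matching the steps listed above.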


Now add 92, 26 and 72 to the tree. 


1  92 > 55 and > 83 so appears on the right of 83.
2  26 < 55 and < 44 so appears to the left of 44.
3  72 > 55, < 83 and > 66 so appears to the right of 66.


Figure 4.25 Binary tree after additions have been made 


If 83 is now removed from the tree, the result is shown in Figure 4.26. 


1  83 is a parent node so is replaced by the first child (66) plus child (72).


Figure 4.26 Binary tree after deletions have been made 

Consider that each element has data and two pointers, a left pointer and a
right pointer. The binary tree in Figure 4.24 can be represented by a 
singly linked list, as shown in the diagram in Figure 4.27. 


Note that the extreme left node contains the smallest number (44) and the 
right hand node contains the highest number (83). 


Figure 4.27 Binary tree represented by a linked list 


Exercise 4.23 [20 minutes] 


Complete the diagram in Figure 4.27, changing the pointers as a node 
is added when the numbers 92, 26 and 72 are added to the sequence. 


Other Tree Pointers 


The left and right pointers will only allow traversing down the structure. 
Two additional pointers, the back pointer and the trace pointer, allow 
additions and deletions to be performed and data to be read in sequence. 


e The back pointer will give the position of the parent of each node. 


e The trace pointer will point to the next node in numerical order for 
each node. 
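
Where a tree has only left and right pointers, the data can still be read in sequence by a recursive in-order traversal. This is a different technique from the trace pointer described above, shown here only for comparison. In this sketch each node is a hypothetical (left, value, right) tuple, and the function name in_order is an assumption.

```python
def in_order(node):
    """Return the values in numerical order: left subtree, node, right subtree."""
    if node is None:
        return []
    left, value, right = node
    return in_order(left) + [value] + in_order(right)


# The tree built from 55, 44, 83, 66 and 47, written as nested tuples.
tree = (
    (None, 44, (None, 47, None)),
    55,
    ((None, 66, None), 83, None),
)
sequence = in_order(tree)
```

For this tree the traversal gives 44, 47, 55, 66, 83. A trace pointer stores the same ordering inside the nodes so that it does not have to be recomputed.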


Using Arrays to Represent Binary Trees 


It is only possible to represent a binary tree as an array when the total 
number of elements in the tree is known beforehand. Note that in this 
case, we would be using the binary tree as the data structure and the 
array as the underlying storage structure. Assume that 100 numbers 
need to be input, sorted into order and that searching will take place. The
array record structure will be: 


RECORD results
    HAS FIELDS left_pointer: Integer
               marks: Integer
               right_pointer: Integer
results_tree[100] ARRAY OF TYPE results


Figure 4.28 Array record structure for a binary tree 


The original numbers were 55, 44, 83, 66 and 47. The array of records is 
shown in Figure 4.29. The data would be entered into the array in the 
order in which the nodes were added. Therefore node 55 is the first one, 
44 the second one added etc. Note that null is represented by -1. 


[Table of array records (columns include Array Record Number); the cell
values are not recoverable from the source.]


Figure 4.29 Binary tree represented by an array 
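
The way the pointers in such an array are filled in can be sketched in Python. This is an illustration under stated assumptions: the function name add_to_tree is hypothetical, subscripts start at 0 (the figure may number records differently), and a Python list that grows as records are added stands in for the fixed array of 100 records. The field names and the use of -1 for null follow the text.

```python
NULL = -1  # null pointer, as in the text


def add_to_tree(results_tree, marks):
    """Append a record for marks and link it into the tree by subscript."""
    new_index = len(results_tree)
    results_tree.append(
        {"left_pointer": NULL, "marks": marks, "right_pointer": NULL}
    )
    if new_index == 0:
        return  # the first record is the root
    current = 0
    while True:
        if marks < results_tree[current]["marks"]:
            side = "left_pointer"
        else:
            side = "right_pointer"
        if results_tree[current][side] == NULL:
            results_tree[current][side] = new_index
            return
        current = results_tree[current][side]


results_tree = []
for number in (55, 44, 83, 66, 47):
    add_to_tree(results_tree, number)

for index, record in enumerate(results_tree):
    print(index, record["left_pointer"], record["marks"], record["right_pointer"])
```

Records are stored in the order in which the nodes were added, so 55 is record 0 and points left to record 1 (44) and right to record 2 (83).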


Summary 


In this chapter we have introduced: 


• Local and global variables.
• Procedures.
• Using parameters.
• Functions.
• One and two-dimensional arrays.
• Tables.
• Sorting and searching.
• Linked lists.
• Data structures, including queues and stacks, graphs and trees.


These are key transferable concepts in programming and can be applied 
in principle to any programming environment. It is important to tackle 
these topics, as they will assist your understanding whenever you learn a 
programming language for the first time. 

10 Self Study


This chapter is concerned with problem solving. You need to work through 
the examples and questions in the order presented in the chapter. Use 
the self study notes relating to each section at the time you are studying 
the section. Additional guidance for each section is provided under the 
section headings in these notes. 


When you attempt a question, check your answer before continuing with 
the text. If you have the wrong answer, try and understand where you 
have gone wrong before moving on. Problem solving is concerned as 
much with the process of solving the problem as it is with the correct 
answer. 


10.1 Procedures and Functions 


When you start using your chosen programming language, you should
spend some time relating the information provided in pseudocode in this
section to the procedure/function structure of your chosen language.


Now that the programs are becoming more complicated, it is worth
mentioning that these program examples are kept simple to avoid obscuring
the new ideas with the complexity of the code. Nevertheless, the simple
mathematics program in the Functions section can be used to form the
basis of a reasonably challenging piece of coding. You could consider the
following:


1. Add validation of the input to the pseudocode provided for the menu 
procedure. 


2. If you try this example when you are using a programming language, 
then clearing the screen and positioning the screen messages are also 
useful additions. 


10.2 Arrays 


Exercise 4.6 is an example of a typical logic problem when using arrays. 
You have to make a logical connection between the array subscripts, the 
numbers you will be multiplying and the controls to be used in a 
FOR...ENDFOR loop. If you found this exercise difficult, it is worth 
practising a little more. 


10.3 Arrays: Sorting and Searching


It is worth spending some time making sure you understand the tables
containing the changing data in the array elements and variables, as the
different passes are made during the sorts. This will help you understand
the sorting algorithm and later, to understand how the alternative sorting
algorithms differ.


Use the table and the pseudocode together, going through each 
pseudocode statement and noting down where the contents of the 
variables have changed. When you think it makes sense, then continue 
with the text and questions. 


Self Study 1 [2 hours 30 minutes]


The bubble sort mentioned earlier in the text is an example of an
exchange selection sort and is often used in teaching sorting
techniques because of its simplicity. However, it is a very inefficient
sort when the number of elements is more than 30.


Research and find alternative sorts, e.g. Quicksort, and study how they
work.


Linked Lists and Arrays 


Singly Linked Lists — Deletions 


[Diagram: the list before and after deletion, showing the previous element,
the current element (to be deleted) and the next element, each with its
next_pointer.]


When the current element is deleted:


• the next_pointer of the previous element must link to the next element.


Before deleting the element:


• make the next_pointer in the previous element := next_pointer in the
  element to be deleted.


When deleting an element, the two tasks identified above will only be
carried out if the element is between two existing elements in the
sequence. If the element to be deleted is the first element in the
sequence, then there is no previous element in the sequence.
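
The deletion step can be sketched in Python for a list held in an array and linked by subscripts. The names delete_element, walk and head are assumptions for this example, as is the use of -1 for a null pointer. Moving the start-of-list pointer when the first element is deleted is one possible treatment of the case noted above, not necessarily the one intended by the text.

```python
NULL = -1  # assumed null pointer value


def delete_element(elements, head, previous, current):
    """Unlink elements[current] and return the subscript of the first element.

    previous is NULL when the element to be deleted is the first one.
    """
    if previous == NULL:
        # No previous element: the start of the list moves on instead.
        head = elements[current]["next_pointer"]
    else:
        elements[previous]["next_pointer"] = elements[current]["next_pointer"]
    elements[current]["next_pointer"] = NULL
    return head


def walk(elements, head):
    """Follow the next_pointers from head and collect the data in order."""
    data = []
    index = head
    while index != NULL:
        data.append(elements[index]["data"])
        index = elements[index]["next_pointer"]
    return data


elements = [
    {"data": "A", "next_pointer": 1},
    {"data": "B", "next_pointer": 2},
    {"data": "C", "next_pointer": NULL},
]
head = 0
head = delete_element(elements, head, previous=0, current=1)  # delete B
after_middle_delete = walk(elements, head)
head = delete_element(elements, head, previous=NULL, current=0)  # delete A
after_first_delete = walk(elements, head)
```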


Self Study 2 


What happens if the element to be deleted is the last element in the 
sequence? 


Singly Linked Lists — Additions 


[Diagram: the list before and after addition, showing the previous element,
the current element (to be added) with its data and next_pointer, and the
next element.]


When the current element is added:


• the next_pointer of the previous element must link to the current element;
• the next_pointer of the current element must link to the next element.


When adding the element:


• make the next_pointer in the current element := the next_pointer in the
  previous element;
• make the next_pointer in the previous element := the address or subscript
  of the current element.
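
The two assignments for an addition can be sketched in Python, again for a list held in an array and linked by subscripts. The names insert_after and walk are assumptions for this example, and -1 is assumed to represent a null pointer. The order of the two steps matters: the previous element's pointer must be copied before it is overwritten.

```python
NULL = -1  # assumed null pointer value


def insert_after(elements, previous, data):
    """Add a new element after elements[previous]; return its subscript."""
    current = len(elements)
    # Step 1: the current element takes over the previous element's pointer.
    elements.append(
        {"data": data, "next_pointer": elements[previous]["next_pointer"]}
    )
    # Step 2: the previous element now points to the current element.
    elements[previous]["next_pointer"] = current
    return current


def walk(elements, head):
    """Follow the next_pointers from head and collect the data in order."""
    data = []
    index = head
    while index != NULL:
        data.append(elements[index]["data"])
        index = elements[index]["next_pointer"]
    return data


elements = [
    {"data": "A", "next_pointer": 1},
    {"data": "C", "next_pointer": NULL},
]
new_subscript = insert_after(elements, previous=0, data="B")
order = walk(elements, head=0)
```

The new element is stored at the end of the array (subscript 2) but is read second, because the sequence is defined by the pointers and not by the storage positions.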

Doubly Linked Deletions


[Diagram: the list before and after deletion, showing the previous element,
the current element (to be deleted) and the next element, each with a
previous pointer and a next pointer.]


When the current element is deleted:


• the next_pointer of the previous element must link to the next element;
• the previous_pointer of the next element must link to the previous element.


Before deleting the current element:


• make the previous_pointer in the next element := the previous_pointer in
  the current element;
• make the next_pointer in the previous element := the next_pointer in the
  current element;
• then delete the current element.
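
These steps can be sketched in Python for an element that lies between two others. The names Node, link and delete are assumptions for this example; the first and last elements are deliberately not handled.

```python
class Node:
    """One element of a doubly linked list."""

    def __init__(self, data):
        self.data = data
        self.previous = None
        self.next = None


def link(*nodes):
    """Chain the given nodes together in order (helper for the example)."""
    for left, right in zip(nodes, nodes[1:]):
        left.next = right
        right.previous = left


def delete(current):
    """Unlink an element that has both a previous and a next element."""
    current.next.previous = current.previous
    current.previous.next = current.next
    # The deleted element no longer refers to the list.
    current.previous = None
    current.next = None


a, b, c = Node("A"), Node("B"), Node("C")
link(a, b, c)
delete(b)
```

After the call, a.next is c and c.previous is a, so the list can still be followed in both directions.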


Doubly Linked Additions 


[Diagram: the list before and after addition, showing the previous element,
the current element (to be added) and the next element, each with a
previous pointer and a next pointer.]


When the current element is added:


• the previous_pointer of the current element must link to the previous
  element;
• the next_pointer of the current element must link to the next element;
• the next_pointer of the previous element must link to the current element;
• the previous_pointer of the next element must link to the current element.

Before adding the current element:


• make the previous_pointer in the current element := the previous_pointer
  in the next element;
• make the next_pointer in the current element := the next_pointer in the
  previous element;
• make the previous_pointer in the next element := the address or subscript
  of the current element;
• make the next_pointer in the previous element := the address or subscript
  of the current element.
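
The four assignments can be sketched in Python, in the same order, for an element inserted between two existing elements. The names Node and insert_between are assumptions for this example, and the cases where the new element becomes the first or last in the sequence are not covered.

```python
class Node:
    """One element of a doubly linked list."""

    def __init__(self, data):
        self.data = data
        self.previous = None
        self.next = None


def insert_between(previous, current):
    """Insert current directly after previous, which must have a next element."""
    following = previous.next
    current.previous = following.previous  # still points at previous here
    current.next = previous.next           # still points at following here
    following.previous = current
    previous.next = current


a, c = Node("A"), Node("C")
a.next, c.previous = c, a
b = Node("B")
insert_between(a, b)
```

The first two assignments copy the old pointers before the last two overwrite them, which is why the order of the steps is important.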


Self Study 3 


When adding an element, the four tasks identified above will only be
carried out if the element is inserted into the sequence. If the new
element becomes the first or last element in the sequence, there will be
slight differences, in terms of whether there is a previous and next
element in the sequence. Identify the algorithms for these two special
cases.


Summary 


The following problem was addressed earlier in the workbook. You should 
now be able to solve it in a much more professional manner, using 
procedures and functions, sorts and searches. 


A member of the teaching staff at a college requires a program which will 
print the examination results for a group of students. The output details 
required are Student ID code, student name, the percentage results of the 
three examinations, the average percentage mark obtained for each 
student and an indication of whether the student has obtained a grade of 
fail, referral, pass, merit or distinction. 


The rules for the allocation of grades are: Fail is less than 30%, Referral 
between 30% and 39%, Pass between 40% and 59%, Merit between 60% 
and 74% and Distinction is 75% and above. A sample of the output 
required is: 


Student ID Code | Student Name    | Exam1 | Exam2 | Exam3 | Average | Grade
C09912345       | Jenny Macintosh |    35 |    43 |    67 |      48 | Pass
C09845678       | Michael Knowles |    56 |    24 |    32 |      37 | Referral
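
The grade rules can be checked against the sample output with a short Python sketch. The function names average_mark and grade are assumptions for this example. The rules as stated leave gaps between bands (for example between 39% and 40%), so the sketch assumes that each band runs up to, but does not include, the start of the next one.

```python
def average_mark(exam1, exam2, exam3):
    """Average percentage over the three examinations."""
    return (exam1 + exam2 + exam3) / 3


def grade(average):
    """Allocate a grade; band edges between whole percentages are assumed."""
    if average < 30:
        return "Fail"
    if average < 40:
        return "Referral"
    if average < 60:
        return "Pass"
    if average < 75:
        return "Merit"
    return "Distinction"


jenny_average = average_mark(35, 43, 67)
michael_average = average_mark(56, 24, 32)
print(round(jenny_average), grade(jenny_average))
print(round(michael_average), grade(michael_average))
```

Rounded to whole numbers, the two averages are 48 and 37, giving Pass and Referral as in the sample.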


Re-visit this problem, but this time use an array of records, enter all the 
data input into the array and for each student, calculate the average and 
grade which should also be stored in the array of records. Print the report.
Provide an additional feature where teaching staff can ask for the results 
of a student to be identified and displayed. 

Use the following four procedures: 


1. Enter data and update the array of records with average mark and 
grade. 


2. Print the report. 


3. Sort the records by student_ID. 


4. Search and display where a function is used to search on the student
ID input by the user, and return the subscript of the element in the
array.


Note that you will need to address the problem of using arrays, that is, a 
maximum size will need to be defined and the procedure to enter data will 
need to ask the user how many students’ results will be entered. 

1 Learning Outcomes


At the end of this chapter you will:


• Understand why the Unified Modelling Language (UML) was developed.
• Know how it can be used and understand its limitations.
• Be able to design a simple system using examples of UML notation.
• Be able to describe how these notations fit together to make a system.
• Be able to explain how these notations can form part of the system
  documentation.


2 Introduction 


Programmers have not always used design and analysis techniques in 
software development. Before these methodologies were developed, 
programmers often only relied on rough sketches. Programs were built up 
gradually from small items of code. Whilst this suited many programmers, 
it has serious drawbacks. It is difficult to communicate the nature of the 
program being built, as there is no overall plan to work to. It would be 
difficult for another programmer to take over the development should the 
need arise and, finally, there is no way to be absolutely sure that what is 
being built is actually what the client wants. 


Modern programs are built differently however. Developers use analysis 
and design techniques to determine client requirements. The resulting 
plans allow each member of the team to understand where their own 
particular work fits into the system being developed and exactly what that 
system is. Another important consideration is the possibility of company 
takeover; this could mean a radical change in the software being 
developed. If that software has been planned thoroughly, those changes 
are far simpler to undertake. 


A design methodology offers a means of notation or expressing the 
design in graphical form, and a process, which is a series of steps to 
follow during the software development process. 


Modelling languages, or means of expressing design in graphical form, 
are an important part of software development for a number of reasons: 


• A model provides an exact specification for the developers.
• The blueprint allows the project managers to estimate cost with
  greater accuracy.
• UML (Unified Modelling Language) offers a method of communication
  between technical developers and the non-technical users, as it
  allows the developers to understand precisely the user requirements.
  It can also assist communication between developers.


An exact specification is created during the design process and the 
analysts discuss with the client their requirements and what the software 
should be able to do. By the end of the design process, an agreed 
specification should be produced, identifying a number of key factors. 
These include the classes that the system will use. The section on CRC 
cards (in Chapter 3) explored a way of discovering classes in a system. 


Other key areas to consider are the objects and people that will interact 
with the software, and the scenarios in which interaction takes place. 
Interaction is not restricted to the system and the outside world; there is of 
course interaction between the objects in the system itself. These are all 
considerations that have to be taken into account in the design and 
analysis process. 


An exact specification means that the programmer knows exactly what 
software to develop, and what it is required to do. Clients and developers 
will share a terminology or set of words and concepts that will make 
communication easier. This means that the client will be able to see that 
the system being built is in fact the one that they want. This shared 
terminology is also used between the development team, which helps to 
ensure that they have a mutual understanding and are therefore 
developing the same system. 


Cost is another important factor, although perhaps one that does not 
seem directly to affect the programming team. It is important for the 
project manager to be able to estimate the cost to the client, and to do this 
the project manager must have a good idea of how long the system will 
take to develop. Part of the cost is the time taken to develop the
software, and it would be impossible to estimate this without a thorough 
understanding of the system. 


3 Why the UML was Developed 


Design methods have been used in industry since the 1970s and 1980s. 
These methods generally consisted of a process and a notation. The 
process suggests a series of steps to follow during development. The 
notation provides a way of expressing designs created in the initial stages 
of development. These early methodologies were created for use in 
structured programming. When object-oriented languages gained 
popularity in the late 1980s, it was felt that the programming world would
benefit from analysis and design methods being applied to object-oriented 
programs. 


A number of methodologies were subsequently developed, each having 
its own process and notation. Each methodology had its devotees and 
there was a great deal of competition between the methodologists to
become the dominant technique. Eventually it became clear that a 
standard was needed for notation. Certain types of diagram were 
common across most methodologies, although they were notated 
differently. This led to some confusion, as people using different 
methodologies might interpret them differently. 


Initially, methodologists were either opposed to a standard, or unwilling to 
conform. Eventually, one of the key methodologists, Jim Rumbaugh, 
decided to collaborate with another methodologist, Grady Booch, at a 
company called Rational Software. Their aim was to merge their methods. 
Together they produced a first version of the merged methods. 


A couple of years later, Rational Software bought a company called 
Objectory, who employed another key methodologist named Ivar
Jacobson. By 1996, the three merged methodologies had become known 
as the Unified Modelling Language, and the three methodologists were 
known as ‘The Three Amigos’. 


A final stage was needed to achieve an accepted standard, as developers 
were reluctant to accept a standard imposed by a software company. An 
independent task force was therefore created to deal with the process of 
creating official UML standards. This task force was part of the Object 
Management Group (OMG). A public version of the UML was finally 
released in 1999. The activities of the Object Management Group can be 
viewed at www.omg.org. 


4 The Advantages and Disadvantages of the 
UML 


Unified Modelling Language is a language rather than a methodology.
This means that it offers a means of notation or vocabulary for expressing 
the underlying ideas of the analysis and design. This differs from a 
method, or process, which offers the developer a recommended way to 
perform object-oriented analysis and design. Competition between the 
different methodologies has meant that graphical design techniques have 
not been as widely used. Yet, UML offers a number of advantages to 
developers: 


• The new UML development tools available can now design systems,
  and allow code to be exported.
• New tools allow for code to be exported as diagrams. This provides a
  method for systems analysts to check that the program will achieve its
  goal.
• It offers a ready-made modelling language which helps developers
  communicate.
• The language is graphical and capable of expressing ideas.
• It enables the modelling of systems and software using object-oriented
  concepts.
• It provides a means of extending the core concepts of the language.
• It is independent of programming languages and development
  processes.
• It provides a formal basis for understanding the modelling language.
• It encourages the growth of the OO tools market.
• It supports higher-level development concepts such as collaborations,
  frameworks, patterns, and components.
• It integrates best practices.
• It addresses the issues of scale inherent in complex, mission-critical
  systems.
• It creates a modelling language usable by both humans and machines.
• Good upfront design will shorten development time.
• A standard model means easier communication between development
  teams.
• It is an official standard.


There are, however, some disadvantages to UML. 


• The initial learning curve can be steep.
• Modelling is a discipline, and therefore needs to be used. A tool will
  not model the system for you.
• Independence of programming languages and processes may not suit
  some people.
• Developers will probably still need an iterative development process.


5 Basic Notations and their Relationship to the
Software Development Process


There are numerous types of diagram in the Unified Modelling Language, 
which relate to a number of different areas of design and analysis. This 
section provides a brief introduction to the various styles of diagramming 
to be found in UML, with simple examples given. The purpose is to make 
you aware of the breadth of UML. Elaborate and more detailed examples 
of several of these techniques are given later in the chapter. The types of 
diagram are listed below: 


• Use case diagrams: for providing system overview from a user
  perspective.
• Class diagrams: for defining system objects and their relationships.
• Sequence diagrams: for illustrating the sequence of events and
  changes over time.
• Component diagrams: for defining software system components, e.g.
  file, table.
• Deployment diagrams: for showing the physical layout of
  components on hardware nodes.
• Statechart diagrams: for illustrating an object's states, e.g. switch on
  or off.
• Collaboration diagrams: for showing how the elements of the system
  being developed work together to achieve the system's purpose.


Use Case Diagram 


Use case diagrams concentrate on the user’s point of view (the point of 
view of the person using the system). To be able to create software which 
is easy to use, it is necessary to consider the system from the end user’s 
point of view. If the end user is unlikely to be a computer expert, it is 
important to take this into account. Use cases should explore scenarios 
which aim to satisfy the needs of those using the system. 


[Diagram: the actor "customer" connected to the use case "withdraw money".]


Figure 5.1 A ‘withdraw money’ use case 


This figure shows the ‘actor’, or thing that is interacting with the system, 
on the left. The actor is connected to the use case on the right. The use 
case is what the actor wants to achieve. 


This scenario entails following these steps: 


• The customer puts their card into the machine, and enters their PIN.
• The machine reads the card, and checks the PIN in its database.
• The PIN is confirmed.
• The machine asks the customer which service they want: withdraw
  money, deposit money, or check account balance.
• The customer chooses to withdraw money, and specifies how much.
• The machine checks that the customer has enough money in their
  account.
• The customer has enough funds, so the machine returns the
  customer's card.
• The money is dispensed through the slot on the front of the machine.


Class Diagrams 


Class diagrams express the classes of objects which are found in the 
system. Classes share attributes and behaviours, for instance the bird 
class shares attributes such as feathers and beaks, and behaviours such 
as the ability to fly and lay eggs. 


[Figure: a class box in three compartments. Name: Bird. Attributes:
feathers, beak, wings. Behaviours: can fly, lays eggs, sings.]


Figure 5.2 The Bird class represented in a class diagram 


The diagram is divided into three parts. The name of the class is at the 
top of the diagram, the attributes are in the centre and the class 
behaviours are in the bottom section. 
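

As a rough illustration of how the three parts of the diagram might map onto
code, the following Python sketch defines a Bird class. The attribute values
and method names are assumptions made for the example; the diagram itself
only names the attributes and behaviours.

```python
class Bird:
    """The Bird class: a name, some attributes and some behaviours."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Attributes (centre section of the class diagram).
        self.feathers = True
        self.beak = True
        self.wings = 2

    # Behaviours (bottom section of the class diagram).
    def fly(self) -> str:
        return f"{self.name} flies"

    def lay_eggs(self) -> str:
        return f"{self.name} lays eggs"

    def sing(self) -> str:
        return f"{self.name} sings"


parrot = Bird("parrot")  # one particular bird created from the class
print(parrot.fly())
```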


Sequence Diagrams 


Sequence diagrams show how objects in the system will interact with 
each other. An object is an instance of a class, that is, it is an occurrence 
of a particular class. A parrot is an instance of the bird class, just as a 
teacher is an instance of the human being class. Sequence diagrams 
differ from some of the other diagrams such as class diagrams, because 
they are dynamic. A class diagram does not show any interaction,
whereas a sequence diagram shows interactions as they unfold over
time.


If we look at the example of a microwave oven, we can explore the 
sequence diagram a bit further. A microwave has a number of 
components, including a timer, heating element and a turntable. The use 
case for cooking food in the microwave involves these steps: 

• The timer is set for a number of minutes.
• The turntable rotates.
• The heating element is activated to cook the food.
• The timer reaches the end of its set time.
• The turntable stops rotating.
• The heating element is turned off.


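[Figure: sequence diagram. The object labels along the top are illegible in
the source; time runs down the page. Messages in order: Set time; Press
‘Start’; Turn on turntable; Turn on heating element; Keep heating (time:
15 minutes); Turn off heating element; Turn off turntable.]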


Figure 5.3 A basic sequence diagram for a microwave oven 
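

The messages in a sequence diagram correspond to one object calling another
in a fixed order. The Python sketch below is an illustration under stated
assumptions: the class names `Timer`, `Turntable` and `HeatingElement` are
taken from the components named in the text, the method names are invented
for the example, and the shutdown order follows the message order in
Figure 5.3.

```python
class Turntable:
    def __init__(self, log: list[str]) -> None:
        self.log = log

    def turn_on(self) -> None:
        self.log.append("turntable on")

    def turn_off(self) -> None:
        self.log.append("turntable off")


class HeatingElement:
    def __init__(self, log: list[str]) -> None:
        self.log = log

    def turn_on(self) -> None:
        self.log.append("heating element on")

    def turn_off(self) -> None:
        self.log.append("heating element off")


class Timer:
    """Sends the messages to the other objects, in time order."""

    def __init__(self, turntable: Turntable, element: HeatingElement,
                 log: list[str]) -> None:
        self.turntable = turntable
        self.element = element
        self.log = log
        self.minutes = 0

    def set_time(self, minutes: int) -> None:
        self.minutes = minutes
        self.log.append(f"timer set for {minutes} minutes")

    def start(self) -> None:
        self.turntable.turn_on()
        self.element.turn_on()
        # A real oven would wait here; the sketch only records the step.
        self.log.append(f"keep heating for {self.minutes} minutes")
        self.element.turn_off()
        self.turntable.turn_off()


log: list[str] = []
timer = Timer(Turntable(log), HeatingElement(log), log)
timer.set_time(15)
timer.start()
print("\n".join(log))
```

Reading the log from top to bottom gives the same ordering as reading the
sequence diagram from top to bottom.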


Component Diagrams 


Component diagrams, along with deployment diagrams, are solely
concerned with the development of computer systems. Modern
development processes are built around components, which matters
particularly in projects that rely on a team for development. A component
can be a table, a data file, an executable file or many other things. If a
class represents an abstract form of a set of attributes and behaviours, for
example a bird class or a mammal class, a component is a software
implementation of that class, e.g. bird.exe (a software executable of the
bird class).


Figure 5.4 A simple component diagram showing the software implementation 
of the bird class 
Deployment Diagrams


Deployment diagrams deal with the hardware aspects of system design. A 
software program is likely to be used all over the world and on many 
different types of computer platform. Deployment diagrams show the 
physical set-up of the system. These diagrams can show the type of 
computers used, the devices they have, and the software which is 
available on each machine. 


Figure 5.5 shows a simple deployment diagram of a computer processor 
and the monitor attached. The main difference between the processor and 
the monitor is that the processor is able to execute components, unlike 
the monitor (the device). The monitor, however, may well have an 
interface with the outside world. 


[Figure: two nodes. <<Processor>> PC (AMD K6 300) and <<Device>>
Monitor (ADI Microscan 4V).]


Figure 5.5 A simple deployment diagram showing a processor and monitor 


Statechart Diagrams 


Objects in the real world are usually in one state or another. This applies 
to people as well as things. Lights can be on or off, a car can move 
forward, backward, or stop. A human can be a baby, a child, married or a 
student. The statechart diagram in Figure 5.6 shows the different states of 
a traffic light sequence. The diagram shows the light’s transitions from 
one state to another, with the starting state symbol at the top and the end 
state symbol at the bottom. 
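

A statechart can be read as a table of allowed transitions. The Python sketch
below is a minimal illustration, assuming the states and their order as they
can be read from Figure 5.6 (red, amber before go, green, amber, red); the
names `Light`, `TRANSITIONS` and `run_sequence` are invented for the example.

```python
from enum import Enum


class Light(Enum):
    RED = "red"
    AMBER_TO_GO = "amber (about to go)"
    GREEN = "green"
    AMBER = "amber"


# Each state has exactly one successor in this simple sequence.
TRANSITIONS = {
    Light.RED: Light.AMBER_TO_GO,
    Light.AMBER_TO_GO: Light.GREEN,
    Light.GREEN: Light.AMBER,
    Light.AMBER: Light.RED,
}


def run_sequence(start: Light = Light.RED) -> list[Light]:
    """Follow the transitions from the start state until it is reached again."""
    states = [start]
    current = TRANSITIONS[start]
    while current is not start:
        states.append(current)
        current = TRANSITIONS[current]
    states.append(current)
    return states


print([state.name for state in run_sequence()])
```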


Collaboration Diagrams 


Collaboration diagrams show how the elements of the system being 
developed collaborate or work together to achieve the system’s purpose. 
As this is such a vital aspect of the system itself, modelling languages 
must have some way of expressing these relationships. These diagrams 
are basically the same as sequence diagrams, except that they 
emphasise the information in different ways. A sequence diagram is 
concerned with time, whilst a collaboration diagram concentrates on the 
overall organisation of the interacting objects. The collaboration diagram 
can be thought of as dealing with the objects from a spatial point of view. 
If we revisit the microwave example from the sequence diagram, we can
see that a collaboration diagram (Figure 5.7) shows much more clearly 
the components which are used in the system. 


[Figure: states shown in order: RED, AMBER (labelled ‘to go’), GREEN,
AMBER, RED]


Figure 5.6 A statechart diagram showing a traffic light sequence 


Figure 5.7 The microwave example shown as a collaboration diagram


The dedicated sections which follow will explore these diagram types in 
more depth. 
6 Use Cases


6.1 Introduction and When to Use 


Systems are often developed according to clients’ and developers’ 
opinions, with little thought given to the users of the system. Systems that 
are built to take into account potential users are usually more beneficial to 
the end user. This is an important consideration, as it is the end user who 
will be the main consumer of your system, rather than you, the developer, 
or the client who has commissioned the system. A system that is 
designed for its end users will be easy to use, rather than difficult and 
frustrating. 


Use cases are extremely helpful for this sort of analysis, as end users 
may find it difficult to articulate their view of the system. The use case 
offers a means of explanation and communication between user and 
developer. These discussions lead to the exploration and identification of 
the scenarios which form part of the system requirement. 


Use cases offer a dynamic view of the system from the point of view of 
those who will be using the finished system. Use case diagrams show 
how the system and its classes will change over time. Whereas the static 
view of the class diagram aids communication with clients, the dynamics 
of the use case are important in communicating with the development 
team. If the development team have a thorough understanding of what the 
system is supposed to achieve, they can create that program. The 
number of use cases created varies according to the system being 
developed, and also the modeller’s preferences. 


To understand how the system should interact with the outside world, it is 
necessary to identify the scenarios which will occur. If you are designing 
an online shop, there are several scenarios which will occur as a matter of 
course; a customer may return to place another order, a customer may 
wish to change their order, check its details, or change their account 
details, such as address or credit card number. A new customer will need 
a new record created for them, whereas a returning customer will already 
be in the database. Your shop may decide to offer discounts to loyal 
customers, or have special promotions. 


The purpose of a use case is to examine the scenarios that will benefit an 
‘actor’, meaning an object which is not part of the system but that interacts 
with it. This could be a human being such as a customer, a piece of 
computer hardware or simply the passage of time itself. A scenario 
describes a sequence of steps which are initiated by an actor, although 
the beneficiary can be another actor rather than the one initiating the use 
case. An example of this would be telephone companies’ call charges.
During certain hours of the day, telephone calls may be cheaper than at
other times: calls made during business hours may be more expensive
than those made outside business hours. Here the passage of time is the
actor that triggers the change of tariff, while the caller is the actor who
benefits.
Whoever the actors for your system are, it is important to identify them
during the analysis stage of software development. A number of questions 
need to be asked to determine exactly what they need the system to do. 
This is the purpose of use case analysis. 


Use case analysis begins with the client interviews which take place to 
determine the initial classes in the system. Once a set of initial class 
diagrams has been created, you will have a basis for discussing possible 
use cases. The analyst's role is then to ask the users to describe all their 
interactions with the system, and to describe each use case i.e. the 
scenarios that will occur. 


It is also important to determine all the actors who will initiate and benefit 
from the system. Analysts can ask questions such as: 

• What exactly do the users want the system to do?

• What abilities must the system have?

• Who will be using the system?

Use cases are not, however, confined to the analysis phase; they can
also be used in both the development and testing phases.


Use case analysis aims to describe how the system will behave rather 
than how it will be implemented. A use case model defines the boundaries 
between the system and the outside world, which is where the system 
ends and the real world begins. A use case model also defines where the 
system interacts with the outside world. 


Study Note 


Before use case analysis can begin, it is important to gain a thorough
understanding of exactly what the client wants. This is known as
requirements analysis, and involves determining the business
processes the client uses and the scope of the client’s domain.
Through this, the analyst gains an understanding of both the user and
the domain terminology.


Domain terminologies are those conceptual terms used by clients for their 
particular domain or ‘world’. There are many examples in the real world, 
such as sport-related terms and ideas, or concepts and terms used in 
computing. 


User terminologies are those words or terms that form the vocabulary 
relating to the system being developed. These are used commonly by the 
system’s users, and may be specific to their jobs or roles. These early 
interviews provide: 

• class diagrams to help define the objects in the system, and


• activity diagrams to represent the business processes.


A high-level set of use cases can also be identified during the analysis of
the client’s domain.


Use case analysis is concerned with expanding on the high-level use
cases identified in the domain analysis. This moves the development
team’s understanding on from the client’s domain towards the system
that is needed.


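[Figure: a package of LAN use cases (connect externally, provide security
levels, connect to internet, and one partly illegible label ending in
‘documentation’), shown with the actors System Administrator, Clerical
Staff, Project Manager, Marketing Manager and Sales Representative]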


Figure 5.8 A use case analysis example 
Figure 5.8 shows some high-level use cases that would be produced
during the analysis phase. These use cases are scenarios which occur in 
the working life of a Local Area Network (LAN), and are applicable to all 
the staff members listed alongside the diagram. The use cases shown are 
packaged together in a simple diagram grouping the information. 


Exercise 5.1 [30 minutes] 


Draw a use case diagram for a CD player remote control. Each 
function of the remote control will be a use case for your model. 


Exercise 5.2 [30 minutes] 


Consider an online shop which sells books, records and videos. What 
major use cases would that online shop involve? 


Scenarios 


Scenarios are an important method of identifying use cases. A scenario is 
a succession of steps depicting an interaction between a user and a 
system. A use case is formed from a collection of scenarios, and each
scenario included describes a sequence of events. These events are
linked together by a common user objective, such as buying a product 
from an online shop. The use case could show the successful purchase of 
a product, along with alternative scenarios such as the product being out 
of stock. These alternative scenarios are used to augment the use case. 


A typical scenario for an online shop could be as follows: 

• The customer browses the online catalogue and selects the items that
he/she would like to purchase.

• These items are added to the shopping basket.

• When the customer has finished browsing and wishes to pay, the
customer then enters credit card details and decides how the goods
should be delivered.

• The system checks the authorisation on the credit card and confirms
the sale both immediately and by email.


Exercise 5.3 [20 minutes] 


Think about how the online shop scenario would differ if the purchaser 
were already known to the system. How would the system deal with the 
customer? 


Write down the scenarios which may occur in this use case. 
Actors


As a use case is created from a user-centred viewpoint, it is important to
consider the actors that will interact with the system. Actors can be 
anything from a human being, another system, hardware or the passage 
of time itself. All these things interact with the application, but are external 
to it. Each actor is a role that the user plays with respect to the system, 
rather than an individual person. 


Example 


For example, every person who buys from an online shop plays the role of
customer, although the same person may play other roles in the system
as well. Actors are responsible for carrying out use cases; one actor can
carry out many use cases, and one use case can have many actors.


Notation for Actors, Use Cases and System Boundaries 


The notation for actors is a stick figure. Actors are shown with the 
initiators on the left of the use case, and receivers on the right of the use 
case. The actor’s name is written just below the stick figure. Actors are 
joined to use cases through an association line, a solid line which 
connects an actor to a use case. Association lines represent 
communication between actors and use cases. The use case is shown as 
an oval with the use case name inside or below it. Use cases are usually 
shown within a system boundary, which is a rectangle with the name of 
the system written inside it. 


Figure 5.9 Notation for use case, system boundary and actors 


[Figure: symbols labelled ACTOR and SYSTEM BOUNDARY]


Exercise 5.4 [15 minutes] 


Are there any other actors interacting with an online shop that you can 
think of? 
Exercise 5.5 [15 minutes]


You are going to install a LAN for a local company. The company has a 
large number of staff, and you need to determine the actors that will 
interact with the system. 


Write down a list of all the actors you can think of. 


Relationships 


Relationships may exist between use cases themselves, as well as
between actors and use cases. There are four types of relationship
between use cases:


• inclusion;
• generalisation;
• extension;
• grouping.


Inclusion Use Case 


If you find that some behaviours are being repeated in two or more
use cases, the UML provides a way to avoid the repetition. Inclusion
allows developers to create a use case that can simply be referred to each
time it is needed, rather than laboriously rewriting it or cutting and pasting.
Aside from the time saved, another advantage of this approach is that the 
developer can make changes to one particular use case. If the behaviour 
had been repeatedly rewritten, the amendments would also need to be 
rewritten by hand for each occurrence. Inclusion use cases are dependent 
on the use case that contains them. They therefore cannot be used on 
their own, and must be used in conjunction with the use case that 
includes them. 


Example 


You may find that if you were designing a drinks machine, you would
need to follow a distinct sequence of events every time you needed to
restock or service the machine. The following steps could be used to
create an open machine use case, which would form the basis of other
use cases, such as a refill machine use case.


Open machine inclusion use case:


• Unlock the machine.
• Enter a security code.
• Open the door.
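

Inclusion has a close parallel in code: a shared routine that is written once
and called wherever it is needed. The Python sketch below is an illustration
only. The function names are invented, and the steps that follow the included
steps ("load new cans", "inspect and repair") are hypothetical placeholders
that do not come from the text.

```python
def open_machine(log: list[str]) -> None:
    """The included use case: steps shared by every maintenance task."""
    log.append("unlock the machine")
    log.append("enter a security code")
    log.append("open the door")


def refill_machine(log: list[str]) -> None:
    open_machine(log)                 # <<include>> open machine
    log.append("load new cans")       # hypothetical step


def service_machine(log: list[str]) -> None:
    open_machine(log)                 # <<include>> open machine
    log.append("inspect and repair")  # hypothetical step


steps: list[str] = []
refill_machine(steps)
print(steps)
```

A change to `open_machine` is made in one place and is picked up by every use
case that includes it, which is the advantage described above.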
[Figure: a use case diagram showing extension and inclusion. Use cases:
RESTOCK THE (label truncated in the source), OPEN MACHINE,
RESTOCK ACCORDING TO BEST SELLERS. Dependencies:
<<include>> (twice) and <<extend>>.]


Figure 5.10 Inclusion and extension example 


Notation for Stereotypes 


Definition: Stereotype 


Label attachments that add extra classification to model items. Some 
stereotypes are predefined and are automatically available e.g. 
<<interface>>, but you can define your own to add whatever extra 
classification is useful. 


For an inclusion, the dependency line points from the main use case to
the included use case. For an extension, the dependency line points
from the extension to the base use case. To show inclusion or extension,
the stereotype symbol is used with the word ‘include’ or ‘extend’ inserted
between the << >>.


<<STEREOTYPE>> 


Figure 5.11 Stereotype symbol 


Exercise 5.6 [15 minutes] 
What use cases are likely to occur more than once for an online shop? 


Exercise 5.7 [15 minutes] 


Draw a diagram showing the use case for accessing account details. 
Generalisation Use Case


Use cases can inherit behaviour and meaning from parents in the same 
way that classes can. This type of modification is known as 
generalisation. 


[Figure: two CHILD elements joined to a parent. Annotation: generalisation
between actors; the solid line and open triangle point to the parent use
case.]


Figure 5.12 Generalisation example 


Example 


For instance, you may already have a use case for buying a drink that you
could add behaviours to. The buy a drink use case could then be used as
the parent for a buy a glass of drink use case. The generalisation could
then add steps for adding mixers, ice and lemon in the child use case.


Steps added in the child use case:


• Add tonic or bitter lemon
• Add ice
• Add lemon


Figure 5.13 Generalisation example 
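

Because use case generalisation works in the same way as class inheritance, it
can be sketched with two classes. This Python example is an illustration
under stated assumptions: the class names are invented, and the parent's own
steps ("choose drink", "pay") are hypothetical, since the text does not list
them.

```python
class BuyDrink:
    """Parent use case."""

    def steps(self) -> list[str]:
        # Hypothetical parent steps; the text does not spell these out.
        return ["choose drink", "pay"]


class BuyGlassOfDrink(BuyDrink):
    """Child use case: inherits the parent's steps and adds its own."""

    def steps(self) -> list[str]:
        return super().steps() + [
            "add tonic or bitter lemon",
            "add ice",
            "add lemon",
        ]


print(BuyGlassOfDrink().steps())
```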


Exercise 5.8 [15 minutes] 


Your online shop has decided to sell shoes. You already have a place
an order use case for your books and records.


What behaviours could you add to your parent use case? 
Exercise 5.9 [15 minutes]


A local company is going to install a LAN. 


Write down a list of all the actors you can think of for the LAN. Some of 
the staff will share certain privileges. Create a diagram showing 
generalisation between these members of staff. The diagram should 
show a hierarchical structure with the generic employee at the top. 


Extension Use Case 


You may find that you need to make additions to your standardised or 
base use case. The base use case can therefore be extended to form the 
basis of new use cases. The person who maintains the machine may find 
that there are certain types of drink that are most commonly bought. The 
supplier may decide to increase the number of cans of these brands, and 
no longer stock the less popular brands. The use case for restocking the 
machine could be extended to allow for restocking according to demand. 


Extension use cases only occur at specified points within the base use 
case’s sequence. The points where extensions occur are called extension 
points. 

Example of Base Use Case 

a. Browse items online.
b. Make selection and add to shopping basket.
c. Go to checkout.
d. Log on to account.
e. Verify account details.
f. Verify items in shopping basket.
g. Final confirmation.
h. Place order.


Example of Extension Use Case 


Someone wishing to buy items from an online shop must log on to the 
system, and provide their account details. These will link the billing and 
delivery details for the customer with the items in the shopping basket. 
New customers, however, will not have an existing account with the
company, and will need to create one before they can complete their
order. The extension point occurs at log on to account.


a. Enter name.
b. Enter address and telephone number.
c. Enter email address.
d. Enter password.
e. Confirm password.
f. Enter payment details.


Figure 5.14 shows this extension as a use case diagram. 


[Figure: a use case diagram showing extension. LOG ON TO ACCOUNT is
joined to another use case (label illegible in the source) by <<include>>,
and CREATE ACCOUNT FOR NEW USER is joined by <<extend>>.]


Figure 5.14 Example extension use case diagram 
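

An extension can be thought of as optional behaviour that is inserted at a
named point in the base sequence. The Python sketch below is a minimal
illustration: the function names and the `has_account` flag are invented, and
it assumes that the account is created immediately before the customer logs
on.

```python
def create_account() -> list[str]:
    """The extension use case: create account for new user."""
    return [
        "enter name",
        "enter address and telephone number",
        "enter email address",
        "enter password",
        "confirm password",
        "enter payment details",
    ]


def place_order(has_account: bool) -> list[str]:
    """The base use case, with one extension point at 'log on to account'."""
    steps = [
        "browse items online",
        "make selection and add to shopping basket",
        "go to checkout",
    ]
    # Extension point: only new customers take the extra steps.
    if not has_account:
        steps.extend(create_account())
    steps += [
        "log on to account",
        "verify account details",
        "verify items in shopping basket",
        "final confirmation",
        "place order",
    ]
    return steps


print(len(place_order(has_account=True)), len(place_order(has_account=False)))
```

The base use case runs unchanged for existing customers; the extension only
takes effect when its condition holds.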


Exercise 5.10 [15 minutes] 


A company has a LAN, and each system user has access to the 
network. The network engineer should be able to perform more 
activities on the LAN than other employees. Write a basic access 
network use case, and then extend that use case to allow the network 
engineer a greater choice of activity. 


Exercise 5.11 [15 minutes] 


Your online shop decides to offer a £5 discount on a video for existing 
customers. The system deducts this amount from the final bill at the 
checkout when the customer enters a special code for the discount. 


Draw a diagram to show how the base use case for the checkout would 
be extended. 


Exercise 5.12 [15 minutes] 


Write down the scenario for deducting the discount from the customer’s 
bill. 
Grouping Use Cases


The developer may find it helpful to organise use cases into groups when
there are many of them. This can happen when the system comprises
several subsystems, or when use cases are being collected while the client
is interviewed about the system
requirements. Each system requirement would be a discrete use case,
and it may prove useful to be able to organise and categorise these 
requirements. 


Grouping Notation 


The notation for grouping use cases is to simply include them inside a 
package, as shown in Figure 5.16. 


PACKAGE 


Figure 5.15 The notation for a package diagram 


[Figure: an example of a customer package. The actor ‘customer’ is shown
with the use cases browse catalogue, log onto system, add to shopping
basket, check order status and buy goods.]


Figure 5.16 Grouping example showing some of the account details needed for 
an online shop 


Exercise 5.13 [15 minutes] 


Identify some of the main use cases for the customer package for your 
online shop. Draw a diagram to illustrate them. 
Consider some of the main use cases a user may require from the
company LAN. Draw a package diagram to illustrate them. 


6.3 Diagrams and Other Methods 


Use cases do not necessarily have to be developed using UML 
diagramming techniques. Developers may find it useful to record 
individual use cases on index cards, and then arrange the cards to 
establish what needs to be built for each iteration. However, for those 
who prefer to use graphical representation and modelling tools, Figures 
5.17 to 5.24 show a summary of the symbols discussed throughout the 
text. 


Figure 5.17 The notation for an actor 


Figure 5.18 The notation for a use case 


[Symbol: a solid line ending in an open triangle]


GENERALISATION 


Figure 5.19 The notation for a generalisation 


ASSOCIATION 


Figure 5.20 The notation for association 


SYSTEM BOUNDARY 


Figure 5.21 The notation for system boundary


Figure 5.22 The notation for dependency


<<STEREOTYPE>> 


Figure 5.23 The notation for stereotype


PACKAGE 


Figure 5.24 The notation for a package 


6.4 Business and System Use Cases 


These two types of use case are concerned with two different aspects of 
system design. Business use cases are concerned with how a business 
responds to a particular customer or event, whereas system use cases 
are concerned with the interaction between the system or software and 
the user. 


A business use case can be useful when considering how to meet an 
actor’s goal, for example, how best to allow the customer to buy the 
goods they desire. Sometimes it is more productive to solve a problem by
changing the business practice, rather than by trying to change the
system to accommodate the problem.


A system use case is more useful for planning the system, and is 
concerned with how the system behaves in particular scenarios. For each 
business use case that has been identified, there should follow a set of 
system use cases. 


7 Class Diagrams 


7.1 Introduction and When to Use 
The Unified Modelling Language has a number of techniques for 
capturing in a visual form, on the page or screen, the variety of aspects 
you need to consider to define an object-oriented system well. 


You have already been introduced to use cases, a way of presenting real 
life systems and the interactions between them. Use case diagrams 
convey high-level views of information systems and allow project
personnel to gain an overall appreciation of the system, but use cases do 
not specify the objects in the system at the level needed to create code. 
This is what class diagrams do. 


Class diagrams let you create a visual map of the objects which make up 
the system you are planning to build. Class diagrams are central to 
object-oriented methods, are at the core of UML and are widely used in 
practice, so you will come across them frequently. This is because class 
diagrams neatly define the code you will need to implement the system. 


A class diagram describes the types of object in the system and the 
various kinds of static relationship existing among them. 


Types of object may include: 


• physical entities e.g. buildings, engines and clocks;
• logical entities e.g. employee records, invoices and speed;
• soft entities e.g. tokens, expressions or data streams;
• conceptual entities e.g. needs, requirements or constraints.


Class diagrams are static. They do not show change, whether it be in 
time, state or order, etc. In UML, changes are captured in a range of 
dynamic modelling techniques such as: 

• activity diagrams;

• collaboration diagrams;

• sequence diagrams;

• state diagrams.

You will meet these diagram types later in this chapter. 


There are two principal kinds of static relationship: 


• subtypes, which define relationships between similar objects (a bus is
a kind of vehicle);


• associations, which define relationships between different objects (a
customer may rent a number of videos).


Subtypes are about inheritance, e.g. parent and child relationships. 


You have already encountered these relationships in Chapter 1. 
Building Class Diagrams
To build class diagrams you need to:


• define the objects in your proposed system;
• group them according to the classes (categories) into which they fall;
• determine parent classes;
• define the attributes and operations for each object;
• define the associations between objects and the messages sent to
the associated objects, by further sorting and rearranging the objects.


Class diagram notation, the symbols you use for this purpose, defines 
how these items and concepts are represented. 


The early steps are straightforward. The later steps become more 
complex because you need to have grasped how the system will work 
and what it will need to do. Your objective is to identify enough information 
to let you code your system. The process is as follows: 


Step 1 
Define the domain for analysis, usually via interviews with the client. From 
the interviews, identify an initial set of objects. 


Let us take the idea of constructing systems for the running of an airport. 
Here is a brief scenario: passengers enter the terminal, queue at the 
check-in desk, show their passports and tickets to the check-in attendant, 
receive a boarding pass if all is well, hand over their luggage to the airline, 
go through customs, do some shopping before their flights are called, 
board the plane, take their seats and wait for take-off. 


From these details it is possible to define an opening set of objects, a 
number of which are given in Figure 5.25. This is your starting point. 


[Figure: separate boxes for PASSENGERS, CHECK-IN STAFF, FLIGHT
ATTENDANT, PILOT, PASSPORT, TICKET, BOARDING PASS and
LUGGAGE]


Figure 5.25 Class diagram showing airport objects (ungrouped) 


Step 2 


From this initial set of objects, try to form some meaningful groups of 
objects which are related. 
Step 3


Next, define and label the associations between some of the classes. A 
useful strategy is to focus on a few classes and see how they are 
associated with one another. Then move on to another group until the set 
of classes is exhausted. 


Step 4 


At this point, focus on the details required for an individual object, and add 
some attributes and operations. 


Now you will have in place diagrams showing objects arranged by 
subtypes and associations, and with attributes and operations defined. If 
you follow this process closely it is, in theory, only a small step to map this 
information onto the lines of code you need to produce for the application. 
At this point you are nearest to defining the messages and methods 
needed to create system interaction. 


Continuing with the airport example, steps 2, 3 and 4 will be looked at in 
greater detail in the sections to follow. 


Subtypes and Generalisation 


So far, you are aware that a system is made up of objects. Objects must 
work together to produce an interactive system, so they need to 
communicate with one another (remember the idea of messages) and 
they can only do that if the links between them are defined, to enable 
them to receive and send messages correctly. 


Subtypes are a form of linkage where individual objects share common
features with other objects and may be said to be examples of “a kind of”
something, e.g. a kind of animal. The test for inclusion in a subclass is to
say that an object “is a” member of that class, e.g. a lion “is an” animal.
These individual objects of the same type are linked hierarchically to an
abstract superclass (in this case “animal”).


Continuing with our airport modelling exercise, the next step is to group 
objects according to their similarities. 


One group that stands out among our airport objects comprises passport, 
ticket and boarding pass. It is reasonable to say these are “a kind of” 
travel document. It is correct to say a passport “is a” travel document. The 
result of these decisions and tests is that we can have a superclass of 
travel documents with subtypes passport, ticket and boarding pass. 
[Figure: PASSPORT and the other travel documents grouped under an
abstract superclass]


Figure 5.26 Class diagram showing objects (grouped) — with abstract class 


It can be seen from Figure 5.26 that there is a parent-child relationship, so 
the subtypes can inherit attributes and operations from the superclass, 
because they will be common to all instances of type “travel document”. 
Object orientation refers to inheritance; the UML refers to generalisation. 
This means that one class (the child class or subclass) can inherit 
attributes and operations from another class (the parent class or 
superclass). The parent class is more general than the child class. 


In UML, inheritance is represented by a line which connects the parent 
class to the child class. On the part of the line which connects to the 
parent class, you put an open triangle pointing to the parent class. 
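

In code, generalisation appears as inheritance from a parent class. The
Python sketch below is one possible rendering of the travel document
grouping, not a prescribed design: the `holder` attribute and the `kind` and
`describe` operations are hypothetical, chosen only to show something being
inherited.

```python
from abc import ABC, abstractmethod


class TravelDocument(ABC):
    """Abstract superclass: holds what all travel documents share."""

    def __init__(self, holder: str) -> None:
        self.holder = holder  # hypothetical shared attribute

    @abstractmethod
    def kind(self) -> str:
        """Each subtype says what kind of travel document it is."""

    def describe(self) -> str:
        # Inherited unchanged by every subtype.
        return f"{self.kind()} held by {self.holder}"


class Passport(TravelDocument):
    def kind(self) -> str:
        return "passport"


class Ticket(TravelDocument):
    def kind(self) -> str:
        return "ticket"


class BoardingPass(TravelDocument):
    def kind(self) -> str:
        return "boarding pass"


for document in (Passport("A. Traveller"), Ticket("A. Traveller")):
    print(document.describe())
```

The "is a" test carries over directly: `isinstance(Passport("A. Traveller"),
TravelDocument)` is true, while the abstract superclass itself cannot be
instantiated.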


Associations 


Step 3 involves defining associations. Associations represent relationships 
between different, not similar, object classes (for example, a person works 
in a company; a company has a number of offices). 


Associations are needed to enable objects to communicate with each 
other. An association describes a connection between classes. The 
concrete relation between two objects of different classes is called an 
object connection or link. Links are said to be instances of an association. 


Usually, an association is a relation between two different classes.
However, an association may also be of a recursive nature; in this
case, a class has a relation with itself.


Definition: Recursion 


A programming method in which a routine calls itself. Recursion is an
extremely powerful concept, but it can strain a computer's memory
resources. Some programming languages, such as LISP and
PROLOG, are specifically designed to use recursive methods.
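
A short sketch may help to connect the two ideas. In the Python example below, which is an illustration only, a hypothetical Employee class has a recursive association with itself (an employee is managed by another employee), and the routine chain_of_command calls itself to follow that association.

```python
from typing import List, Optional


class Employee:
    def __init__(self, name: str, manager: Optional["Employee"] = None) -> None:
        self.name = name
        # Recursive association: the other end of the link is also an Employee.
        self.manager = manager


def chain_of_command(employee: Employee) -> List[str]:
    """Return the names from this employee up to the top of the hierarchy."""
    if employee.manager is None:
        # Base case: nobody manages this employee, so the recursion stops.
        return [employee.name]
    # Recursive case: the routine calls itself on the next object up.
    return [employee.name] + chain_of_command(employee.manager)


director = Employee("Director")
supervisor = Employee("Supervisor", manager=director)
attendant = Employee("Attendant", manager=supervisor)
print(chain_of_command(attendant))
```

Each call waits for the call above it to finish, which is why deep recursion can strain memory resources.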
Each association has two association ends; each end is attached to one
of the classes in the association. An end can be explicitly named with a 
label, called a role name. The addition of arrowheads to the association’s 
lines indicates navigability, the direction or directions in which messages 
flow. Navigation may be either unidirectional (moves in one direction only) 
or bi-directional (moves in both directions between objects). 


In our airport example, passengers have an association with the attendant 
at the airline’s check-in desk. They show their travel documents to the 
attendant who checks they are correct. This is the relationship featured in 
Figure 5.27. 


PASSENGER --- SHOW DOCUMENTS ---> CHECK-IN ATTENDANT


Figure 5.27 Class diagram showing association 


Note: This diagram could be extended by adding a link from the attendant
to the passenger to describe the handing over of boarding passes. It would
then show a bi-directional relationship.


An association end also has multiplicity, which is an indication of how 
many objects may participate in the given relationship. In general, the 
multiplicity indicates upper and lower bounds for the participating objects. 
If the minimum is 0, the relation is optional. 


In our airport exercise, the association between a flight attendant and
on-board passengers is one to many, as indicated in Figure 5.28.


FLIGHT ATTENDANT 1 --- SERVES --- * PASSENGERS


Figure 5.28 Class diagram showing multiplicity 
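
One way to picture multiplicity and navigability in code is shown below. This is a sketch in Python under stated assumptions: the attribute names passengers and served_by are invented for the example. The "many" end is held as a list and the "one" end as a single reference, and because each object holds a reference to the other, the association is bi-directional.

```python
class Passenger:
    def __init__(self, name: str) -> None:
        self.name = name
        self.served_by = None  # the "1" end: at most one attendant


class FlightAttendant:
    def __init__(self, name: str) -> None:
        self.name = name
        self.passengers = []  # the "*" end: any number of passengers

    def serve(self, passenger: Passenger) -> None:
        # Create a link (an instance of the association) in both directions.
        self.passengers.append(passenger)
        passenger.served_by = self


attendant = FlightAttendant("Sam")
for name in ("Ann", "Bob"):
    attendant.serve(Passenger(name))

print(len(attendant.passengers))
print(attendant.passengers[0].served_by.name)
```

Removing the served_by attribute would make the association unidirectional: the attendant could reach the passengers, but not the other way round.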


There are several ways of naming associations, such as using a verb 
phrase so that the relationship can be used in a sentence, or a noun to 
name the role of one or other of the ends. Sometimes all associations are
named; at other times only those whose names improve understanding.


There are special variations of an association — aggregation and 
composition. 


Sometimes a class consists of a number of component classes. This is
a special type of relationship called an aggregation. The components and
the class they constitute are in a part-whole association. In our example,
the airport is a whole made up of parts including terminal, car park, air 
traffic control tower, etc. 
[Figure 5.29 diagram not reproduced; surviving label: AIR TRAFFIC CONTROL]


Figure 5.29 Class diagram showing aggregation 


You represent an aggregation as a hierarchy with the whole class at the 
top and the components below. A line joins the whole to the component 
via an open or unfilled diamond below the whole. In an aggregation, it is 
not necessarily the case that each component belongs to one whole. For 
example, in a home entertainment system, a remote control may be a 
component of a television, and the same remote control could be a 
component of the videocassette recorder. 


Composition is a strict form of aggregation in which the part’s existence is
dependent on the whole (the entirety). If the entirety is deleted, so are its
parts. If a single part is deleted, the entirety remains. For example, consider that
in a game of electronic chess the playing area is made up of classes 
‘square’ and ‘board’. Each square is part of exactly one board. It would 
not be sensible to copy or delete the ‘board’ object without copying or 
deleting the ‘square’ objects. 
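
The difference between the two can be sketched in Python. The example below is only an illustration, and the class and attribute names are assumptions. In the aggregation, the remote control is created outside the television and recorder and is shared by both. In the composition, the board creates its own squares, so they exist only as part of that board.

```python
class RemoteControl:
    def __init__(self, model: str) -> None:
        self.model = model


class Television:
    def __init__(self, remote: RemoteControl) -> None:
        self.remote = remote  # aggregation: the part is supplied from outside


class VideoRecorder:
    def __init__(self, remote: RemoteControl) -> None:
        self.remote = remote  # the same part may belong to another whole


class Square:
    def __init__(self, row: int, column: int) -> None:
        self.row = row
        self.column = column


class Board:
    def __init__(self, size: int = 8) -> None:
        # Composition: the board creates and owns its squares.
        self.squares = [Square(r, c) for r in range(size) for c in range(size)]


remote = RemoteControl("universal")
television = Television(remote)
recorder = VideoRecorder(remote)
board = Board()

print(television.remote is recorder.remote)  # one component, two wholes
print(len(board.squares))
```

When the board object is discarded, nothing else refers to its squares, so they disappear with it; the remote control, by contrast, survives the loss of either appliance.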


Constraints may be used to restrict a relation in specific respects.
For example, in our airport diagram below, the association between the 
booking system and long haul jet is constrained by the maximum number 
of passengers the plane can carry. You can see this is important 
information for the programmer, who can build that constraint into the 
software. Constraints are usually placed between brackets in a diagram, 
but there is a formal language called the Object Constraint Language 
(OCL), which can be used if desired. 


LONG-HAUL JET --- (MAX PASSENGERS: 450) --- BOOKING SYSTEM


Figure 5.30 Class diagram showing constraints 
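
To show how a programmer might build such a constraint into the software, here is a minimal Python sketch. The figure of 450 comes from the diagram; the class layout and the book operation are assumptions made for the example.

```python
class LongHaulJet:
    MAX_PASSENGERS = 450  # the constraint recorded on the association


class BookingSystem:
    def __init__(self, jet: LongHaulJet) -> None:
        self.jet = jet
        self.bookings = []

    def book(self, passenger_name: str) -> bool:
        """Accept the booking only while the constraint still holds."""
        if len(self.bookings) >= self.jet.MAX_PASSENGERS:
            return False
        self.bookings.append(passenger_name)
        return True


system = BookingSystem(LongHaulJet())
results = [system.book(f"Passenger {n}") for n in range(451)]
print(results.count(True))  # number of bookings accepted
print(results[-1])          # outcome of the booking beyond the limit
```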
Attributes and Operations


Step 4 in the class diagram process involves taking the classes and 
objects defined so far, and assigning to them attributes (things the objects 
know about) and operations (things objects do). This requires you to work 
out these aspects and then to record them in a clear manner to guide the 
programmer and allow communication with other parties within a 
development project. 


Attributes and operations are recorded in class icons, drawn as shown in
Figure 5.31.


attribute1
nextAttribute

operation1()
nextOperation()


Figure 5.31 Class icons showing attributes and operations 


Class icons are divided into three panels. 


• The top panel carries the class or object instance name.
• The middle panel describes attributes.
• The bottom panel describes operations.


A one-word attribute is written in lower case letters. If the name consists 
of more than one word, the words are linked, and each word other than 
the first word begins with an upper case letter. The convention is the 
same for operations, so single words begin with lower case letters and 
combined words are capitalised after the first word. 


Attributes 


The attributes of a class describe the data contained in an object of the 
class. An attribute has a name and a type. For example, each object of 
class ‘book’ will have a title, which is a string, so the name of the attribute 
is title and the data type is a string. This level of detail is optional — it 
depends on what you think is required within the diagram, but it would 
look like Figure 5.32. 
title: String
nextAttribute 


Figure 5.32 Example of book class icon showing attribute information 


Apart from string, other data types may include floating-point number, 
integer and Boolean. 


An attribute is a property of a class. It describes a range of values the property
may hold in objects (that is, instances) of that class. Every object of the 
class has a specific value for every attribute. This can be shown as in 
Figure 5.33. 


title: String = "UML for Beginners" 
nextAttribute 


operation1 () 
nextOperation () 


Figure 5.33 Example of book class icon showing attribute and operations 
information 
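
The attribute notation name: type = value maps closely onto typed fields in many programming languages. The Python sketch below is one possible rendering of the book class; title and its default value come from Figure 5.33, while page_count and in_print are hypothetical attributes added to show integer and Boolean types. Python's own convention is to join words with underscores rather than capital letters.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str = "UML for Beginners"  # name: type = default value
    page_count: int = 0               # hypothetical integer attribute
    in_print: bool = True             # hypothetical Boolean attribute


default_book = Book()
other_book = Book(title="Modelling Objects", page_count=120)

# Every object of the class has a specific value for every attribute.
print(default_book.title)
print(other_book.title, other_book.page_count, other_book.in_print)
```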


Study Note 


It may be helpful, when learning about attributes, to think of them as
similar to the fields of a record in a database. Records have field names
and fields for data entry, and specify the type of data allowed in a given
field. If you are familiar with the idea of database records, you may find
this helps you to understand object attributes.


Operations 


Operations are the processes that a class knows how to carry out. The 
operations of a class define the ways in which objects may interact. When 
one object sends a message to another, it is asking the receiver to 
perform an operation. The receiver will invoke a method to perform the 
operation; the sender does not know which method will be invoked, as 
there may be many methods implementing the same operation at different 
levels of the inheritance hierarchy. 
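
The point that the sender does not know which method will be invoked can be seen in a small Python sketch. The classes and the check operation below are assumptions for illustration. The function inspect sends the same message to whatever document it is given; the receiver decides which method runs.

```python
class TravelDocument:
    def check(self) -> str:
        # Method defined at the top of the inheritance hierarchy.
        return "generic document check"


class Passport(TravelDocument):
    def check(self) -> str:
        # A different method implementing the same operation.
        return "passport photo and expiry check"


class Ticket(TravelDocument):
    pass  # no method of its own, so the superclass method is used


def inspect(document: TravelDocument) -> str:
    # The sender only names the operation; it does not choose the method.
    return document.check()


print(inspect(Passport()))
print(inspect(Ticket()))
```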


© NCC Education. All rights reserved. Unauthorised duplication is prohibited. 


Programming Methods Chapter 5 — Modelling Objects 


V1.1 


An operation has a name and may take one or more parameters and 
return a value. This structure is known as the signature of an operation. 


The UML syntax for operations is: 
Visibility name (parameter-list): return-type-expression {property-string} 


When searching for the operations an object should perform, you should 
look for operations implied by the responsibilities documented during 
analysis. 


Returning to our airport scenario, we need to assign attributes and 
operations to some of the objects identified so far. If we take the check-in 
attendant as an example, initially we may identify the attributes and 
operations in Figure 5.34 as being relevant. 


<<id info>>
name
address
employeeNumber
jobTitle

checkPassport()
checkTicket()
assignSeat()
takeLuggage()
giveBoardingPass()


Figure 5.34 Example of check-in attendant class icon showing attribute and
operations information


It is likely these attributes and operations will be necessary for inclusion in 
an airline system. Note that this is a first pass, and on reflection there may 
be a case to remove, change or add to this current set of attributes and 
operations. This is fine; no-one is right first time and the approach allows 
for steps to be revisited. 
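
As a rough guide to how such a class icon might later be turned into code, the Python sketch below carries over the four attributes and two of the five operations from Figure 5.34. The bodies of the operations are placeholders invented for the example (for instance, taking the first free seat); the real rules would come from analysis.

```python
class CheckInAttendant:
    def __init__(self, name: str, address: str,
                 employee_number: str, job_title: str) -> None:
        # Attributes: things the object knows about.
        self.name = name
        self.address = address
        self.employee_number = employee_number
        self.job_title = job_title

    # Operations: things the object does.
    def assign_seat(self, free_seats: list) -> str:
        """Placeholder rule: hand out the first free seat."""
        return free_seats.pop(0)

    def give_boarding_pass(self, passenger_name: str, seat: str) -> str:
        return f"Boarding pass: {passenger_name}, seat {seat}"


attendant = CheckInAttendant("Sam", "1 Airport Road", "E042", "Check-in attendant")
free_seats = ["12A", "12B"]
seat = attendant.assign_seat(free_seats)
print(attendant.give_boarding_pass("Ann", seat))
```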
This exercise covers the four steps of the class diagram process. You
can divide each step into a 30 minute exercise if you wish. 


1. Only some of the objects in the airport scenario have been identified. 
Can you think through the scenario and introduce other objects? 
Remember, it helps to think through a sequence of events and to 
decide who or what is involved. Focus in particular on what happens 
when a passenger passes through security. Which objects can you 
identify for that scenario? 


2. Continuing with the airport scenario, other than travel documents, 
which other groupings of objects can you identify? Consider: 


(a) the people involved at an airport — can you identify them? 


(b) metal objects from our security scenario — which metal objects 
might be carried by passengers? 


3. Which associations of objects can you think of in the airport
scenario? In particular, which objects can you associate with
passenger?


4. You have the attributes defined for the check-in attendant. Now carry 
out the process of assigning attributes and operations to the 
passenger class. Think of as many relevant attributes as you can: 
do the same with operations. Use the icon structure described 
previously. Keep to the naming conventions. 


You may add as much detail as you see fit. 


8 Interaction Diagrams 


8.1 Introduction 


Interaction diagrams concentrate on the messages (or events) which flow 
between objects. There are two types of interaction diagram: 


• Sequence diagrams.
• Collaboration diagrams.


A sequence diagram shows the relationship between specific objects and 
allows you to study how a set of objects interacts in time. Sequence 
diagrams do not need to be used for every set of objects which interact, 
but if the developer is unsure about how the objects will interact, the 
sequence diagram will help to clarify this. A sequence diagram should 
correspond to a specific scenario and developers compare them to class 
diagrams to see how they correspond. Sometimes, developers create
sequence diagrams after developing use cases and scenarios and then 
use the result to prepare the class diagram. However, it can be useful to 
prepare a basic class diagram, develop the sequence diagram and refer 
back for further class diagram construction. 


A collaboration diagram is a cross between an object diagram and a 
sequence diagram. The objects from a scenario are shown as in an object 
diagram but, instead of showing lines which link objects, arrows are 
inserted and numbered to show a specific sequence of events. By
following the numbered arrows, the movement of messages can be
followed. Thus, like a sequence diagram, one collaboration diagram
describes one specific scenario.


8.2 Sequence Diagrams 


State diagrams in the UML exist to be able to express how objects in a
system can change over a period of time from one state into another 
state. They concentrate on information about each object state and the 
means by which they change, but the key missing information is that 
concerning time. In other words, how do multiple objects communicate 
with each other over a period of time and in what sequence? Sequence 
diagrams fulfil that function and have been developed to encapsulate the 
interaction between objects and show this information in diagrammatic 
form. 


Think of a Sequence diagram as being laid out in graphical form with 
objects laid out horizontally, left to right, and time laid out vertically, from 
top to bottom. Sequence diagrams start at the top left with objects laid out 
left to right, and each object’s time-based information laid out vertically, 
downwards along its lifeline. This is represented diagrammatically as 
shown in Figure 5.35. 


[Diagram labels: Object; Lifeline; Activation; Message; Message Symbols:
Simple, Synchronous, Asynchronous]

Figure 5.35 UML sequence diagram basic representation 
An activation represents an execution of an operation and its duration is
indicated by the length of the activation rectangle. Messages are passed 
from one activation to another (or back to itself on the same lifeline) and 
span horizontally from the end of an activation in one object’s lifeline, to 
the beginning of an activation in another object’s lifeline. 


The UML represents the transfer of control from one object to another 
through messages, which start at the top and progress down to the 
bottom. Messages can be sent in three distinct forms: 


• simple: a straightforward transfer of control.


• synchronous: the sending object waits for an answer from the
receiving object before it continues with its activities.


• asynchronous: the sending object does not wait for an answer
before it proceeds.
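
The difference between synchronous and asynchronous messages can be sketched in Python using the standard concurrent.futures module. This is an illustration only; the functions run_up_to_speed and show_track are hypothetical stand-ins for operations on receiving objects.

```python
from concurrent.futures import ThreadPoolExecutor


def run_up_to_speed() -> str:
    return "up to speed"


def show_track(number: int) -> str:
    return f"Track {number}"


# Synchronous: the sender waits here until the receiver has answered.
reply = run_up_to_speed()

# Asynchronous: the message is handed over and the sender carries on.
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(show_track, 1)
    sender_continued = True       # work done without waiting for an answer
    displayed = pending.result()  # the answer is collected later, if needed

print(reply, displayed, sender_continued)
```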


As a practical example, consider a CD player that is already loaded
and ready to play. Examining it highlights the specific activations and
messages required to interpret the command and display the track number
as part of the playing operation. The steps which the system goes through
when the ‘Play’ button has been pressed are as follows:


• The user input software informs the system software that the ‘Play’
button has been pressed.


• The system software notifies the drive motor controller to run up to
speed.


• Once up to speed, the motor controller sends a signal to the system
software.


• The system software then informs the laser mechanism controller to
commence reading data from the first track.


• The laser controller informs the system software that Track 1 is being
read.


• The system software notifies the display software to display ‘Track 1’
on the display.


This is represented in the sequence diagram in Figure 5.36. 




[Diagram: lifelines for the Play button, three software objects and two
controllers; messages labelled depress and feedback]


Figure 5.36 CD player sequence diagram 


As indicated by the message symbols, some of the messages are
asynchronous and the object sends them without waiting for an answer,
whilst some are synchronous, e.g. drive motor up to speed, laser
confirming reading Track 1, and these wait for feedback before
progressing.


Exercise 5.16 [20 minutes] 


Draw a sequence diagram for the CD player when the stop button is pressed,
paying attention to the notation for the message form. 


It can be useful to tie in the sequence diagram to another UML object 
modelling technique, the use case diagram. Remember that use cases 
deal with specific events from the user’s perspective. In the example of 
the CD player, the user presses the play button to instigate the play 
operation. This can be shown by diagramming the system interactions 
onto the use case, as shown in Figure 5.37. 


Depress the 
Play button 


Figure 5.37 The use case diagrammed by the sequence diagram 
Sequence diagrams come in two forms to deal with real life scenarios: the
first deals with an action in which one scenario, and only one scenario, is 
possible. This is known as an instance sequence diagram. The second 
deals with actions, following which a number of scenarios can occur. This 
is known as a generic sequence diagram. It is probably useful to 
remember that for every use case, a sequence diagram is used to 
illustrate the activities, sequence and relationships therein. 


A real life example serves to illustrate the difference. When a bank 
customer attempts to draw money from a cashpoint terminal, there are 
two possible scenarios. The first is that in which everything happens as it 
should. His/her card is inserted, the PIN number is checked, the card is 
checked for validity, the bank balance is checked to see if sufficient funds 
are available to cover the withdrawal amount, the cash dispenser checks 
to see it has sufficient funds of the selected denomination and the cash is 
then dispensed. 


In an ideal world, all of the above checks would succeed. This can be
considered a unique instance, so it is modelled as an instance sequence
diagram.


The sequence is as follows:


a) The customer inserts his/her cashpoint card into the card slot.


b) The PIN number is entered and the amount of money required is
selected.


c) The account is checked for available funds. 


d) The dispenser is checked for available funds of the correct 
denomination. 


e) Assuming everything is correct, the card is ejected. 
f) The correct amount of cash is dispensed through the cash slot. 


An instance sequence diagram modelling this ideal scenario is shown in 
Figure 5.38. 
Insert (Input)


Select (Amount) 


Figure 5.38 The instance sequence diagram modelling the best case scenario 


Exercise 5.17 [20 minutes] 


Draw an ‘ideal’ sequence diagram for making a call using a telephone 
which stores the numbers in memory. 


Of course, the situation could turn out to be less than ideal and a number 
of eventualities could prevent the cash being issued to order. For 
example: 

• If the PIN number is incorrect.

• If the customer’s bank account does not contain adequate funds.


• If the bank’s funds do not cover the amount of cash requested.


These situations are catered for and are modelled as a generic sequence 
diagram, as shown in Figure 5.39. 
Insert (Input)
Send (Input)
[PIN = OK] Request (amount)
[PIN Not OK] Display (message)
Send (Amount)
Select (Amount)
[Amount > Funds] Display (message)


Figure 5.39 The generic sequence diagram modelling multiple scenarios 
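
The conditions in square brackets behave like ‘if’ tests in a program. The Python sketch below is a simplified illustration of the generic scenario; the function name, its parameters and the wording of the messages are assumptions. It returns the list of messages that would be sent for a given set of circumstances, so the ideal scenario is simply the path on which every condition is satisfied.

```python
from typing import List


def withdraw(pin_ok: bool, amount: int, funds: int, cash_store: int) -> List[str]:
    """Return the messages sent for one cash withdrawal attempt."""
    messages = ["insert(input)"]
    if not pin_ok:                      # [PIN Not OK]
        messages.append("display(PIN incorrect)")
        return messages
    messages.append("select(amount)")   # [PIN = OK]
    if amount > funds:                  # [Amount > Funds]
        messages.append("display(insufficient funds)")
        return messages
    if amount > cash_store:             # cashpoint cannot cover the amount
        messages.append("display(amount unavailable)")
        return messages
    messages.extend(["eject(card)", "dispense(amount)"])
    return messages


print(withdraw(True, 50, 100, 1000))   # the ideal scenario
print(withdraw(False, 50, 100, 1000))  # incorrect PIN
print(withdraw(True, 500, 100, 1000))  # not enough money in the account
```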


Exercise 5.18 [15 minutes] 


What other problems could occur in the above cash withdrawal 
scenario? 


Sometimes, when an activity is being carried out, a new object is created. 
If this occurs, the new object can be added to the sequence diagram, but 
not in the usual way. Instead of positioning it as usual at the top of the 
diagram as a named rectangle, it is placed in the diagram at the position, 
and, therefore, at the point in time at which it was created. It is also linked 
to the message which created it in an operation described as being a 
constructor operation. The label for the ‘create’ message is accompanied 
by brackets to imply that it is an operation. As well as using the ‘if’
statement, the ‘while’ statement, which requires the addition of an asterisk
(*) to the square brackets, can also be used.


To demonstrate object creation and the use of the ‘while’ (as well as the 
‘if’) statement, consider a typist creating a letter. Because the typist has 
created many letters in the past which are stored on the computer’s hard 
drive, it may be possible to re-use an existing letter as a template. If none 
are appropriate, a new letter will have to be created. Microsoft Word will 
be used to create the letter and save it after completion. The resulting 
sequence diagram will appear, as shown in Figure 5.40. 
[Diagram messages, in the order they survive:]

start search
search
[found] open (file)
open and save as (letter)
[not found] new (file)
create()
new and save as (letter)
*[working] use Word application
modify
[done] close and save
close
closed


Figure 5.40 The sequence diagram for creating a letter 
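
The same scenario can be sketched in Python to show where the ‘if’ and ‘while’ conditions and the constructor operation appear in code. This is an illustration under stated assumptions: the Letter class, the write_letter function and its parameters are invented for the example, and no real word processor is involved.

```python
class Letter:
    def __init__(self, text: str = "") -> None:  # the constructor operation
        self.text = text
        self.saved = False


def write_letter(saved_letters: dict, wanted: str, edits: list) -> Letter:
    template = saved_letters.get(wanted)   # search the stored letters
    if template is not None:               # [found]
        letter = Letter(template.text)     # open and save as a new letter
    else:                                  # [not found]
        letter = Letter()                  # create a new, empty letter
    remaining = list(edits)
    while remaining:                       # *[working]
        letter.text += remaining.pop(0)    # modify
    letter.saved = True                    # [done] close and save
    return letter


stored = {"invoice": Letter("Dear customer, ")}
reused = write_letter(stored, "invoice", ["please ", "pay."])
fresh = write_letter(stored, "complaint", ["Hello."])
print(reused.text)
print(fresh.text)
```

Note that the new Letter object does not exist until the point in the function at which it is created, which mirrors its position part-way down the sequence diagram.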


Sometimes, an object needs to invoke itself more than once in order to 
carry out a repetitive task. This is known as recursion (see definition on 
page 5-28) and is used in many other branches of software development. 
The UML copes with this in diagrammatic form, showing the symbol for 
recursion as shown in Figure 5.41. 


[Diagram labels: activation; input(); messages to and from the object that
initiated the recursion; recursion shown as a small rectangle]


Figure 5.41 The UML representation of recursion 


Sequence diagrams are therefore a visually effective way of
encapsulating the sequential and time-based elements of a system or
process. The use of activations shows where an object executes one of
its operations and when. Messages of varying types are included to show
the information being passed from one object to another, and diagrams
can be classified as either instance (the unique, ‘best case’ scenario) or
generic. Objects can be created as and when desired throughout the
process; they do not just exist at the beginning of the exercise.


Collaboration Diagrams 


The section on sequence diagrams explained their use in terms of being 
able to express how objects in a system can be modelled over a period of 
time as they change from one state into another. Collaboration diagrams 
provide an alternative analysis method, and instead of examining objects 
from a time point of view, they do so from a space point of view. In other 
words, the collaboration diagram highlights the context and overall 
organisation of the objects as they interact. 


Whereas an object diagram shows the objects and the relations
between them, the collaboration diagram takes this one step further and 
concentrates on the messages the objects send to one another. As 
collaboration diagrams have a lot in common with sequence diagrams, 
you have to be able to represent the sequence diagram information in a 
collaboration diagram. The sequential information is visualised by 
labelling the message with a number corresponding to its order in the 
sequence. The syntax for collaboration diagrams is shown in Figure 5.42. 


Figure 5.42 UML collaboration diagram basic representations 


The practical example of the CD player demonstrates this further. To
recall, the steps that the system goes through once the ‘Play’ button has
been pressed are as follows:


• The user input software informs the system software that the ‘Play’
button has been pressed.


• The system software notifies the drive motor controller to run up to
speed.


• Once up to speed, the motor controller sends a signal to the system
software.


• The system software then informs the laser mechanism controller to
commence reading data from the first track.


• The laser controller informs the system software that Track 1 is being
read.


• The system software notifies the display software to display ‘Track 1’
on the display.
This is represented in the collaboration diagram as shown in Figure 5.43.


[Diagram labels: Play button; depress; feedback()]


Figure 5.43 CD player collaboration diagram 


The arrows near the association line show the message type (simple, 
synchronous or asynchronous) and its details, along with the direction to 
the receiving object. The order of the messages is indicated by the
message number preceding the message (and separated by a colon).
The brackets allow the inclusion of any parameters the operation works 
on. 


Exercise 5.19 [20 minutes] 


Draw a collaboration diagram for a CD player when the stop button is
pressed, paying attention to the sequence and the message form. 


Objects can of course change states during a process and this can be 
modelled in the collaboration diagram by showing the initial state in one 
rectangle, followed by another rectangle showing its new state. The two 
need to be connected by a dotted line which is annotated with the 
<<become>> stereotype. 


Looking at a partial picture of the CD player demonstrates this — see 
Figure 5.44. 
[Diagram labels: Play button; depress; <<become>>; 1: notify(depress);
8: feedback(); Software]


Figure 5.44 CD player collaboration diagram 


The collaboration diagram can deal with conditions in the same way that 
the sequence diagrams do. Remember, the process still has to be 
addressed sequentially and to address this, the messages are numbered 
sequentially. Examining the example of the ‘best case’ scenario for the 
cashpoint machine, i.e. the customer puts in his/her card and withdraws 
the cash amount required, the collaboration diagram would look as in 
Figure 5.45. 


insert (input, amount) 


Figure 5.45 Collaboration diagram for ‘best case’ scenario cash withdrawal 


Conditions are represented in the same way as in the sequence diagram 
by putting them inside square brackets and following this with the 
message label. The example of the cashpoint machine illustrates the use 
of sequence numbering and also the use of conditions. In this example, 
the following conditions have to be considered: 
• The PIN number is incorrect.
• The customer's bank account does not contain adequate funds.
• The bank’s funds do not cover the amount of cash requested.


If the PIN number is wrong, the step in which a message is displayed on 
the front panel needs to be added, along with the conditions. If the funds 
in the customer’s account do not cover the withdrawal amount, a step to 
send a different message to the front panel is included (again with the 
necessary conditions). Finally, if the cash held in the cashpoint (the 
cashstore) does not cover the cash required, a further step needs to be 
added to send another message to the front panel. The collaboration 
diagram to model this example with the conditions stated, is shown in 
Figure 5.46. 


Dispenser

insert (input, amount)
1: add(input, amount)
[Input = PIN] 2: check(PIN)
[PIN = OK] 4: send(amount)
[amount < funds] 5.1: check(cashstore)
[cashstore > amount] 6.1: dispense(amount)


Figure 5.46 Collaboration diagram for multiple conditions at cash point machine 


Note: After any condition statement, the number before the decimal point 
sequentially states the sequence and the number after the decimal point 
states the alternative branches that can be taken when the condition 
statement is checked. This is called nesting. 


A further similarity with sequence diagrams is that collaboration diagrams 
can also deal with the visualisation of object creation and ‘if’ and ‘while’ 
statements. ‘If’ statements are included in the diagram above. The 
example of the typist creating a new document illustrates both object 
creation and the use of the ‘while’ statement. Object creation requires that 
a <<create>> stereotype is added to the message and the ‘while’ 
statement requires that an asterisk is placed before the square brackets.
Recalling the typist scenario, the options available were to either create a 
new letter from a previously saved version, or to start from a new blank 
document. The collaboration diagram would take the form shown in 
Figure 5.47. 


1: startSearch()
[found] 4.1: open(file)
[notfound] 4.2: new(file)
*[working] 7: useWord()
[done] 10: closeAndSave()

2: search()

3: result()

5: openAndSaveAs(letter)
8: useWord()

14: done()

<<create>> 6: createFile()
9: modify()
12: close()


Figure 5.47 Collaboration diagram for typist creating and saving a letter 


All of the above examples have dealt with objects that communicate with 
each other on a one-to-one basis. It could be the case that an object 
needs to communicate with multiple receiving objects and sometimes in a 
particular order. 


These two scenarios are examined in turn, by the use of the following 
example. 


• A burglar alarm controller needs to communicate with several
proximity sensors located around a property. To represent this, the
sensors are shown as a stack of multiple objects and the message
includes a bracketed condition preceded by an asterisk.


• If the same sensors had to be polled in turn, the same stacked objects
are used, but this time, a ‘while’ condition is added to the message
indicating an order for the sensors to be checked.


The diagram in Figure 5.48 illustrates both scenarios. 
Figure 5.48 Collaboration diagram showing multiple receiving messages and
order
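
In code, a stack of receiving objects usually becomes a collection, and sending a message to each of them becomes a loop. The Python sketch below illustrates both scenarios; the class names, the arm and poll operations and the sensor locations are assumptions made for the example.

```python
class ProximitySensor:
    def __init__(self, location: str, movement: bool = False) -> None:
        self.location = location
        self.movement = movement
        self.armed = False

    def arm(self) -> None:
        self.armed = True

    def poll(self) -> bool:
        return self.armed and self.movement


class AlarmController:
    def __init__(self, sensors) -> None:
        self.sensors = list(sensors)  # the stack of multiple objects

    def arm_all(self) -> None:
        # The same message goes to every sensor; the order is not important.
        for sensor in self.sensors:
            sensor.arm()

    def poll_in_turn(self) -> list:
        # The sensors are checked one after another, in list order.
        triggered = []
        for sensor in self.sensors:
            if sensor.poll():
                triggered.append(sensor.location)
        return triggered


controller = AlarmController([
    ProximitySensor("hall"),
    ProximitySensor("kitchen", movement=True),
    ProximitySensor("garage", movement=True),
])
controller.arm_all()
triggered = controller.poll_in_turn()
print(triggered)
```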


Exercise 5.20 [15 minutes] 


Can you think of other examples where multiple receiving objects 
receive a message and where they need to be received in order? 


Sometimes, a message is used to request that the receiving object 
performs a calculation. For example, when a spreadsheet is being used 
and a cell is filled with a formula to work out the value of two numbers 
held in other cells, by activating that cell, the calculation is performed. The 
syntax for representing this in a collaboration diagram is as follows: 


1: answer:= add(cell1, cell2) 


The name of the returned value is on the left, followed by “:=”, followed by
the name of the operation and the quantities it is working on. The right 
side of the expression is called a message signature. 
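
The message above reads much like an assignment statement in a program. As a minimal Python sketch, assuming a hypothetical Spreadsheet class whose cells are held in a dictionary:

```python
class Spreadsheet:
    def __init__(self) -> None:
        self.cells = {}

    def add(self, cell1: str, cell2: str) -> int:
        """Perform the calculation requested by the message."""
        return self.cells[cell1] + self.cells[cell2]


sheet = Spreadsheet()
sheet.cells.update({"A1": 2, "A2": 3})

# answer := add(cell1, cell2)
answer = sheet.add("A1", "A2")
print(answer)
```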


It may be the case that a specific object controls the flow and interacts 
with other passive objects to maintain the workflow. This object is referred 
to as an active object and is shown in the collaboration diagram as a 
rectangle, but to differentiate it from the passive objects, the rectangle’s 
border is made bold. As an example, an office typing pool has an 
administrator whose job it is to receive incoming work and allocate it to 
the typists. The collaboration diagram is shown in Figure 5.49 with the 
administrator shown as an active object. 
1: typingRequest(letter)
2: getRequest(report)
4: lookUp(report)


Figure 5.49 Collaboration diagram highlighting an active object 


Another aspect of collaboration diagrams which requires attention is that 
of synchronisation. This situation arises when an object can send a
message only after several other messages have been sent.
In effect, it has to synchronise its message with a set of other
messages. As a practical example to demonstrate this, a house building
company has the following process in place for building a new housing
estate:


a) The architect who has designed the estate, asks the quantity 
surveyor to calculate the materials and labour to be used. 


b) The quantity surveyor asks the construction manager to produce a 
bill of materials to collate all the materials required. 


c) The construction manager asks the buyer to obtain quotations for 
the materials from a local supplier. 


d) The buyer asks the local supplier for a quotation. 


e) After the construction manager has produced a bill of materials, the 
accounts department require a copy to budget for the development. 


The key point here is that step e) cannot be undertaken until steps b) and 
c) have been completed. In other words, the system needs to 
synchronise. The collaboration diagram syntax is to show step e) as a 
separate process. This time, instead of preceding the message with a 
number, it is preceded with a list of the messages that have to be 
completed before step e) can take place. A comma separates one list 
item from another and the list ends with a slash (/). The diagram in Figure 
5.50 illustrates this scenario. 

[Message label: 1: calculate(materials, estate)] 


Figure 5.50 Synchronisation between objects in a collaboration diagram 
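
The synchronisation rule can be sketched in Python as follows. This is an illustration under stated assumptions: steps a) to d) are taken to be messages 1 to 4, so step e) waits for messages 2 and 3, and the class name SyncMessage and the message name copy(billOfMaterials) are hypothetical.

```python
class SyncMessage:
    """A message that may only be sent once all prerequisite messages are done."""

    def __init__(self, name, prerequisites):
        self.name = name
        self.pending = set(prerequisites)  # message numbers still outstanding
        self.sent = False

    def notify_completed(self, message_number):
        """Record a completed message; send this one when nothing is pending."""
        self.pending.discard(message_number)
        if not self.pending:
            self.sent = True
        return self.sent


# In diagram syntax this would be written as: 2, 3 / copy(billOfMaterials)
copy_to_accounts = SyncMessage("copy(billOfMaterials)", prerequisites=[2, 3])
log = []
for number in [1, 2, 3, 4]:
    log.append((number, copy_to_accounts.notify_completed(number)))
```

The message is held back while any number in its list is still pending, and is released as soon as the last one completes.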


Exercise 5.21 [15 minutes] 


Draw a collaboration diagram showing the synchronisation required for 
planning a holiday. 


The analysis of the UML undertaken so far has examined objects from a 
static point of view. Items covered include their features and structures and 
how to represent the basic building blocks through diagramming 
techniques. In order to present a more realistic picture, however, it is 
necessary to examine what happens in the real world. 


For systems to function and to perform operations, things or objects do 
not remain static, they change. Change is normally triggered by some sort 
of action or interaction and usually over a period of time. The UML 
addresses this dynamic or behavioural element with another set of 
building blocks and diagramming techniques, which form the focus of the 
next section. 


State Diagrams 


Any system exists to perform a function and this function usually takes 
place over a period of time. As activities are performed, the system 
changes from state to state and to be able to understand these changes, 
it is useful to know the following: 


• The condition of the object prior to any action. 

• The event or stimulus that causes the object to undergo a change of 
state. 

• The activities which bring about the change of state. 

• The outcome of the change activity and thus the object’s final 
condition after the event. 


The UML state diagram is the method by which dynamic change in 
objects is represented and the basic symbols used are shown in Figure 
5.51. 


[Symbols shown: state symbol; transition symbol; starting point of a sequence; end point of a sequence] 


Figure 5.51 UML symbols in a state diagram 


Just as with class icons which can be sub-divided to show their name, 
attributes and operations, the state symbol can be similarly sub-divided to 
represent its name, the state variables and the activities which can occur, 
as shown in Figure 5.52. 


[Compartments of the state icon, top to bottom: Name; State Variables; Activities] 


Figure 5.52 Symbol for the state icon showing details 


State variables can be seen as the equivalent of a class’s attributes and 
state activities equivalent to a class’s operations. As an example, consider 
an electronic camera. 


• It takes pictures with the date and time shown on each exposure but 
requires the camera to be switched on and the button to be 
depressed before this will happen. 

• Other variables include the battery level, the exposure speed, the 
aperture setting and whether the flash is required or not. 


This situation would be represented as in Figure 5.53. 

Photographing 


Date = Current Date 

Time = Current Time 

Exposure Number = Film Exposure Number 
Battery Level = Current Battery Level 
Exposure speed = Current Exposure Speed 
Aperture Setting = Current Aperture Setting 


entry/press shutter button 
exit/complete taking picture 
do/add datestamp 

do/add timestamp 

do/open shutter 

do/close shutter 
do/advance film 


Date = Current Date 
Time = Current Time 

Exposure Number = Film Exposure Number 
Battery Level = Current Battery Level 
Exposure speed = Current Exposure Speed 
Aperture Setting = Current Aperture Setting 


entry/taking picture complete 
exit/begin next shot 

do/show battery level 
do/show exposure speed 
do/show aperture setting 


Figure 5.53 The camera as an example of a state with variables and activities 


Note the appearance and syntax for the activities ‘entry’, ‘exit’, and ‘do’. 
These events are used frequently and highlight what happens when the 
system enters the state, when it leaves the state and when it is in that 
state, respectively. The arrow represents the symbol of a transition from 
one state to another. 
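
To make the three keywords concrete, the Photographing state could be sketched in Python as below. This is only an illustration: the class name, the run_state helper and the idea that advancing the film increases the exposure number are assumptions, while the activity names are taken from Figure 5.53.

```python
class PhotographingState:
    """Sketch of a state with one state variable and entry/do/exit activities."""

    def __init__(self, exposure_number):
        self.exposure_number = exposure_number  # state variable
        self.trace = []                         # records the activities performed

    def entry(self):
        # Performed once, when the system enters the state.
        self.trace.append("press shutter button")

    def do(self):
        # Performed while the system is in the state.
        for activity in ["add datestamp", "add timestamp", "open shutter",
                         "close shutter", "advance film"]:
            self.trace.append(activity)
        self.exposure_number += 1  # assumed effect of advancing the film

    def exit(self):
        # Performed once, when the system leaves the state.
        self.trace.append("complete taking picture")


def run_state(state):
    """Run the activities in the order the keywords imply."""
    state.entry()
    state.do()
    state.exit()
    return state.trace


state = PhotographingState(exposure_number=7)
trace = run_state(state)
```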


Exercise 5.22 [20 minutes] 


Draw a state diagram for a portable CD player moving from the state of 
not playing to playing a particular CD track. 


Exercise 5.23 [20 minutes] 


In what way does a state diagram differ from a class, object or use 
case diagram? 


Of course, a system does not change from one state into another without 
something effecting that change. It requires something to bring about the 
change of state and this can come either from inside or from outside the 
system. An event external to the system is termed a trigger event, such as 
pressing a switch or button. One originating from inside the system is 
termed an action, for example, a software routine that sends a signal for 
something to happen. Looking at a CD player as an example to 
demonstrate the process, the diagram in Figure 5.54 shows the CD player 
being switched on, a trigger event, followed by a transition to an internal 
self check, an action, after which the CD player changes to working mode. 
By switching the player off, another trigger event, the player changes to 
the shutting down state. 


[Figure labels: Turn CD on; do/Selfcheck] 


Figure 5.54 The CD player during the switch on/switch off process 


A more expensive CD player may have the facility to automatically 
change to standby, a power-saving mode, if not used for 30 minutes. This 
is known as a guard condition which, when it occurs, allows the transition 
to the ‘standing by’ state to take place. This is shown in Figure 5.55. 


[Figure labels: Turn CD on; Shut Down; Shutting Down; Activate CD controls; [is Timeout]; Standing by. Note: the guard condition is written as a Boolean.] 

Figure 5.55 The CD player state diagram showing the guard condition 
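
The trigger events, the internal action and the guard condition can be combined in a short Python sketch. It is an illustration only: the class and method names, the initial "Off" state and the tick method used to simulate time passing are assumptions that do not appear in the figures.

```python
class CDPlayer:
    TIMEOUT_MINUTES = 30

    def __init__(self):
        self.state = "Off"          # assumed initial state
        self.idle_minutes = 0

    def turn_on(self):
        # Trigger event (external), followed by an internal action.
        if self.state == "Off":
            self.self_check()
            self.state = "Working"

    def self_check(self):
        # Action (internal): stands in for the real self-check routine.
        self.idle_minutes = 0

    def is_timeout(self):
        # Guard condition: a Boolean that must be true for the transition.
        return self.idle_minutes >= self.TIMEOUT_MINUTES

    def tick(self, minutes):
        # Simulates time passing without any user input.
        if self.state == "Working":
            self.idle_minutes += minutes
            if self.is_timeout():
                self.state = "Standing by"

    def activate_controls(self):
        # Trigger event that returns the player from standby.
        if self.state == "Standing by":
            self.idle_minutes = 0
            self.state = "Working"

    def shut_down(self):
        # Trigger event: switching the player off.
        self.state = "Shutting Down"


player = CDPlayer()
history = []
player.turn_on()
history.append(player.state)
player.tick(29)
history.append(player.state)
player.tick(1)
history.append(player.state)
player.activate_controls()
history.append(player.state)
player.shut_down()
history.append(player.state)
```

The transition to standby is attempted on every tick, but it only takes place once the guard returns true.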


The CD player’s working state finds it in preparation for functional activity 
and this needs to be examined more closely. Within this working state, it 
is awaiting some kind of input and, once this happens, the player 
can take the appropriate action. These substates, as they are known, can 
occur either sequentially, i.e. one after the other, known as sequential 
substates, or in parallel, known as concurrent substates. 
Diagrammatically, and again using the CD player as an example, these 
two kinds of substate are represented in Figure 5.56. 


It is worth noting that any state that breaks down into substates, whether 
sequential or concurrent, is termed a composite state. 


Therefore the working state in Figure 5.56 is a composite state. 


[Figure labels: Sequential substates: Awaiting User Input; Enabling User Input; Registering User Input. Concurrent substates. Guard: [is Track Different]] 


Figure 5.56 Representation of sequential and concurrent substates 


For the sake of continuity, it is important that when an object changes out 
of a composite state, it remembers its active substate. 


In the case of the CD player, it must return from the standby state in 
exactly the same condition it was in when it changed out of it. 


The UML caters for this by remembering its history state: 


• The “H” icon shown in Figure 5.57 represents this. This occurs if the 
history is shallow, i.e. it only needs to remember the highest nested 
substate. 

• If, however, it has to remember substates nested within other 
substates, it is referred to as deep, and is represented by H*. 

[Figure labels: Awaiting User Input; Enabling User Input; Registering User Input; [is Track Different]; Activate CD controls; [is Timeout]; Standing By] 


Figure 5.57 Inclusion of history icon “H” 
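
A shallow history state amounts to storing the active substate on the way out of the composite state and restoring it on the way back in. The Python sketch below illustrates this under assumptions: the class and method names are hypothetical, and "Awaiting User Input" is assumed to be the default substate when no history exists.

```python
class CDPlayerWithHistory:
    def __init__(self):
        self.state = "Working"                 # composite state
        self.substate = "Awaiting User Input"  # active substate
        self.history = None                    # plays the role of the "H" icon

    def enter_substate(self, name):
        self.substate = name

    def stand_by(self):
        # Leaving the composite state: remember the active substate.
        self.history = self.substate
        self.state, self.substate = "Standing By", None

    def resume(self):
        # Re-entering the composite state: restore the remembered substate.
        self.state = "Working"
        self.substate = self.history or "Awaiting User Input"


player = CDPlayerWithHistory()
player.enter_substate("Registering User Input")
player.stand_by()
substate_in_standby = player.substate
player.resume()
```

A deep history (H*) would need to store the whole chain of nested substates rather than a single name.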


An exit from the standing by state requires some kind of 
message. Such messages are known as signals and, like other elements 
that make up the object domain, can be treated as objects with 
attributes and, therefore, inheritance properties. In the CD player example, 
such a signal would be the user opening the CD drawer. 


Note: States which have no state variables and no activities, such as the 
history state, are termed pseudostates. 


Exercise 5.24 [30 minutes] 


Draw a state diagram for a microwave oven in which the history state 
comes into play. 


Activity Diagrams 


Activity diagrams are perhaps most familiar to those individuals who have 
undertaken courses in programming techniques. Programmers would be 
encouraged to diagram their designs into flowcharts which provide a 
logical visualisation of the program and which help them keep track of 
the processes that need to be dealt with. 


In the UML, activity diagrams perform the same function, albeit more 
comprehensively. They track activities as well as decision points and 
branches. In effect, they are a simplified view of what happens during 
operations and processes. Whereas state diagrams show the states of an 
object and their connections, activity diagrams concentrate on the 
activities taking place. 


Activities themselves are visualised in the UML as more rounded 
rectangles than those representing states, and arrows represent the 
transitions from one activity to the next. Each diagram has a start point 
and an end point and, as well as linear flows, can also deal with 
decisions. 


Decisions can be represented in two ways and the diagram in Figure 5.58 
shows the basic visual elements of activity diagrams as well as the two 
forms of decision point representation. 


[Basic symbols: start point; activity; end point. Alternative ways of showing a decision: branches guarded by [template] and [no template], leading to Use (template)] 


Figure 5.58 Basic elements of an activity diagram and decision point 
representation 


Figure 5.58 shows linear transitions, i.e. activities which take place one 
after the other. Sometimes the system being modelled requires that 
certain activities need to happen at the same time, i.e. concurrently. The 
UML copes with this by including a solid thick line from which the paths to 
the concurrent activities emerge and then, following the activities, another 
solid thick line to bring the separate paths back together. Figure 5.59 
highlights this. 

[Figure labels: Take Picture; Enable Flash; Enable Shutter] 


Figure 5.59 Representation of concurrent activities 
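
In code, the first thick line corresponds to starting several activities without waiting for each other (a fork) and the second to waiting until all of them have finished (a join). The Python sketch below uses the standard concurrent.futures module. The function names are hypothetical, and how the activities of Figure 5.59 connect to one another is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def enable_flash():
    return "flash enabled"


def enable_shutter():
    return "shutter enabled"


def run_concurrent_activities():
    # Fork: both activities are submitted without waiting for each other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        flash = pool.submit(enable_flash)
        shutter = pool.submit(enable_shutter)
        # Join: result() blocks until the corresponding activity has finished.
        return [flash.result(), shutter.result()]


outcome = run_concurrent_activities()
```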


Exercise 5.25 [20 minutes] 


Would some activities occur concurrently when pressing the CD player 
to ‘Play’ and if so, how would you illustrate this in an activity diagram? 


It is also possible, following an activity, to send and receive signals and 
annotate them in the activity diagram. The first signal encountered would 
be an output signal and the second, an input signal. These are shown in 
Figure 5.60, using the example of the CD player. 


[Figure labels: Panel.keyIn(track); Output signal; Input signal] 


Figure 5.60 Representation of input and output signals 


Activity diagrams can be used to model both operations and processes. 
Two examples illustrate the way these are handled. 


The first example deals with a software operation that calculates a 
numerical series. Each number in the series is the sum of the previous two 
numbers. 


The first two numbers are automatically 1, as there are not two 
numbers before them. From then on the calculation can be 
performed: 1+1=2, then 1+2=3, 2+3=5 and 3+5=8. The 
resulting series is: 1, 1, 2, 3, 5, 8, 13, 21 and so on. The operation activity 
diagram is shown in Figure 5.61. 


[Figure labels: calculateNumber(n); guards [n>1] and [n>Counter]; activities Answer:=Answer1+Answer2, Counter:=Counter+1, Answer1:=Answer2, Answer2:=Answer, print(Answer, Counter); note on output format: "The " Counter "th Number is: " Answer] 


Figure 5.61 Activity diagram modelling an operation 
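
The operation modelled in Figure 5.61 might be coded in Python roughly as follows. The variable names follow the labels in the figure, but the starting values and the exact order of the steps are assumptions, since only the labels of the diagram are available here.

```python
def calculate_number(n):
    """Return the n-th number of the series 1, 1, 2, 3, 5, 8, ... (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    answer1, answer2 = 1, 1   # the first two numbers are 1 by definition
    answer = 1
    counter = 2               # two numbers of the series are already known
    while n > counter:        # decision point [n>Counter]
        answer = answer1 + answer2
        counter += 1
        answer1, answer2 = answer2, answer
    return answer


series = [calculate_number(i) for i in range(1, 9)]
```

Each pass through the loop corresponds to one circuit of the activities between the two decision points in the diagram.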


As can be seen, the activity information is explicitly stated inside the 
rounded rectangles with the decision points indicated by diamonds and 
the arrows representing the transitions. 


Exercise 5.26 [20 minutes] 


How would you represent the activity diagram for an operation to 
calculate a sequence of numbers which adds ‘1’ to the previous 
number, i.e. 1, 2, 3, 4, 5, 6 etc.? 


Looking now at how a process is represented, we will refer back to the 
example of the typist creating a document. Let us assume that the typist 
wants to add graphics and tables to the document. The process is 
reasonably straightforward and can be visualised in the diagram as shown 
in Figure 5.62. 


[Figure labels: Open Word Package; Type the Document; [graphics needed]; Open and Use Graphics Package; [tables needed]; [tables not needed]; Open and Use Spreadsheet; Save the File; Print Hardcopy] 


Figure 5.62 Activity diagram modelling a process 


Normally, when a system which involves people is being designed, 
responsibilities are allocated to specific personnel. Activity diagrams 
handle this using a technique called swimlanes. 


Consider the tasks to be undertaken and allocated when setting up a 
training course. The diagram in Figure 5.63 shows a simple activity 
diagram and the self-explanatory steps in preparing and carrying out the 
course. 


[Figure labels: Prepare Materials; [require OHP]; [require video]; Assemble Delegates] 


Figure 5.63 Simple process activity diagram for preparing a training course 


The diagram as it stands does not make clear who has responsibility for 
each activity. This is done by using parallel lanes or ‘swimlanes’ in which 
each swimlane represents an activity channel for each participant. The 
diagram in this form would be as shown in Figure 5.64. 

[Figure labels: swimlane Technician; Prepare Materials; [require OHP]; [require video]; Prepare OHP; Assemble Delegates; Deliver Course] 


Figure 5.64 Process activity diagram showing swimlanes 
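
The grouping that swimlanes perform can be sketched in Python by pairing each activity with the participant responsible for it. In this illustration only the Technician lane and the activity names come from Figure 5.64; the Trainer and Administrator lanes, and the allocation of activities to them, are assumptions.

```python
from collections import defaultdict

# (activity, responsible participant) in the order the activities occur.
steps = [
    ("Prepare Materials", "Trainer"),
    ("Prepare OHP", "Technician"),
    ("Assemble Delegates", "Administrator"),
    ("Deliver Course", "Trainer"),
]


def swimlanes(steps):
    """Group activities into one lane per participant, keeping their order."""
    lanes = defaultdict(list)
    for activity, participant in steps:
        lanes[participant].append(activity)
    return dict(lanes)


lanes = swimlanes(steps)
```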


Exercise 5.27 [20 minutes] 


How would you show an activity diagram, in swimlane form, modelling 
the process for organising a wedding, showing the responsibilities of 
some of the participants? 


8.6 How Diagrams Fit Together 


The previous sections have provided an overview of all of the major 
components that go to make up the UML modelling technique. The 
technique provides all involved parties (the stakeholders) i.e. the analyst, 
the programmer, the client and the end user, with a common model from 
which to observe, review and implement a system. 


This can be compared to the documents used in the building trade when 
preparing to erect buildings. The architect's drawings, the quantity 
surveyor’s bill of materials and the land surveyor’s site survey are used in 
combination to correctly build the property. The same is true of the UML 
model: it is the central repository to which all parties can refer to 
correctly build a system. It is important to remember that the model 
describes only what the system is supposed to do and not how to 
implement it. Each stakeholder can view the system in his/her own way 
and learn what they need, relevant to their own interests. 


The purpose of the model, of course, is to provide the programmers with a 
way of understanding the system, giving them the information needed to 
code the program quickly and efficiently. As with most system 
development exercises, the process can be broken down into a number of 
stages (even though the word ‘stage’ implies that one has to be 
completed before the other can start, which is definitely not the case). 
These stages are: 


• Requirements gathering. 

• Analysis. 

• Design. 

• Development. 

• Deployment. 


The latter two stages, i.e. development and deployment, are effectively 
the domain of the programmers and the implementers, i.e. those who 
deploy the finished system onto hardware platforms. The first stages, i.e. 
requirements gathering, analysis and design, are where the UML comes 
into its own as an effective tool. Looking at each in turn will give a good 
indication of the role of the various diagram types in the system 
development process. 


Requirements Gathering 


This is probably the most important element of the entire process. If 
analysts fail to understand the requirements of the system, the chances of 
it turning into an effective solution are very slim. Requirements gathering 
breaks down into several subsections as follows: 


Discover Business Processes 


The analyst interviews the client or his/her representative to fully 
understand their requirements. By going through these processes one by 
one, the analyst gains an understanding of the environment and thus the 
prevalent vocabulary or terminology. The outcome is a set of activity 
diagrams which encapsulate the steps and decision points in the 
processes. 


Perform Domain Analysis 


This activity helps the analyst to further understand the working domain 
and aims to discover the entities existing therein. By identifying the nouns 
used, the analyst can start to identify the objects to create a high-level 
class diagram. Some of the nouns may eventually become attributes, but 
that will be discovered at a later stage. The verbs discovered will 
eventually become the operations of the classes. 


Note: It is important to identify the system boundary at this stage because 
no one system works in isolation. Acknowledgement must be given to 
the links that connect the domain to the outside world. 


Discover System Requirements 


Stated simply, this activity is performed by the analyst by bringing 
together the key representatives of the client organisation. Collecting all 
the representatives together provides the opportunity for discussion and 
conflict resolution with the aim of arriving at an agreed view of what they 
collectively want from the system. The outcome is a series of package 
diagrams, each of which represents a high-level area of system 
functionality and groups together a set of use cases. 


Analysis 


Once the requirements have been recorded and agreed between the 
analyst and the client representative, analysis can begin. This activity also 
breaks down into a number of sub-activities as follows: 


Understand System Usage 


This is effectively a high-level use case analysis wherein the 
development team work with potential users to discover the main actors 
who will initiate or benefit from each use case. Other use cases (new or 
abstract) may be identified in the process, and the outcome of this 
exercise will be a set of use case diagrams, showing the actors and any 
stereotyped dependencies (<<extends>> and <<includes>>) between 
use cases. 


Flesh Out Use Cases 

The objective here is to analyse the sequence of steps in each use case 
and produce a text-based description of the steps in each of them. 

Refine the Class Diagrams 


As the development team gain a deeper insight into the requirements, 
they can begin to refine the class diagram and start to add such elements 
as the names of associations, abstract classes, multiplicities, 
generalisations and aggregations. 


Analyse Changes of State in Objects 


In this further step, the model is refined by showing changes of 
state wherever necessary. The outcome of this activity is a state diagram. 


Define the Interactions Among Objects 


At this stage, the development team has a set of use case diagrams and 
a refined class diagram. They can now define how the objects interact, by 
developing a set of sequence diagrams and a set of collaboration 
diagrams which should illustrate the interaction. Any state changes should 
be included at this stage. 


Analyse Integration with Co-operating Systems 


While the work to date is being undertaken, the systems engineer can 
discover details of the integration with co-operating systems, such as 
the communication involved, the network architecture and, if the system 
requires it, access to databases. If database access is required, the 
database architecture design can commence. The outcome of this activity 
is the detailed deployment diagrams. 


Design 


At this stage, the development team can take the output from the Analysis 
stage to begin to design a solution. This is usually done iteratively until the 
design is effective and complete. 


Design and Refine Object Diagrams 


At this stage, the programmers take the class diagrams and generate any 
necessary object diagrams. They refine the object diagrams by examining 
each operation and generating a corresponding activity diagram. These 
activity diagrams will provide the basis for the coding in the Development 
stage. 


Develop Component Diagrams 


The aim here is to visualise the components that will result and show the 
dependencies among them. The component diagrams are the outcome of 
this activity. 


Plan for Deployment 


If the system being developed has a computer interface that users will use 
to access the system, an interface specialist will work with the users to 
generate prototypes of the interfaces for their inspection and comment. 
The aim is to develop user-friendly interfaces that will be used effectively. 
The outcome of this activity is a set of interfaces appropriate to the client 
requirements. 


Design Tests 


Use cases enable the design of tests for the software, with the objective 
of establishing whether the software will perform as it is supposed to. 


Development 


Construct Code 


Now that all previous stages have been completed, the programmers can 
begin to construct the code for the system. They have access to the class 
diagrams, the object diagrams, the activity diagrams and the component 
diagrams, all of which will be referred to as the code is constructed. 


Construct User Interfaces, Connect to Code and Test 


Using the prototype interfaces developed, the interface specialist can test 
them out by connecting the interfaces to the background code. 


Following all of these steps, Deployment can begin: the system is installed 
and tested against the requirements, and for robustness. It is apparent that 
UML modelling has played a significant role in the entire system 
development process and provides a reference to which any stage of this 
development can be mapped and compared. The process is not 
wholly sequential, but can involve many iterations between one stage and 
another, in order to continually refine the system for optimum performance 
and ability to function to the client’s satisfaction. 


9 Summary 


In this chapter we have covered: 


• Why the Unified Modelling Language (UML) was developed. 

• The advantages and disadvantages of the UML. 

• Basic notations and diagrams. 

• Use cases including scenarios, actors, relationships and business 
and system use cases. 

• Class diagrams including subtypes and generalisation, associations 
and attributes and operations. 

• Interaction diagrams including sequence diagrams and collaboration 
diagrams. 

• State and activity diagrams. 


You should now have a comprehensive understanding of how the UML 
works and fits together. Essentially, these diagramming techniques offer 
a range of views of a system, which is necessary, as one view or 
perspective alone would not indicate everything you need to consider 
before constructing a system. 


Note that the UML is flexible in that you do not have to use all the 
techniques in all circumstances, or even in great depth. The point is to 
use enough for you to gain value from the exercise. 


This chapter provides a brief introduction to UML diagrams; it is an 
extensive topic and you should try to extend your knowledge through 
reading and practice. 


10 Self Study 


These exercises and self study recommendations are designed to help 
you learn more about modelling objects. The exercises consist of: 


1. Recommended reading. 
2. Internet research on key topics. 
3. Activities. 
4. Review questions. Use them as follows: 
• Work through the questions and jot down your initial answers. 
• All the answers are contained in the text of the chapter. Go 
back and review the text to check the accuracy of your 
answers. Where an answer is not correct or incomplete, enter 
the correct answer against the question and use this for 
revision or for retesting at a later date. 


Self Study 1 


Match the image to the definition 


A. 


• GENERALISATION 
• PACKAGE 
• DEPENDENCY 
• ACTOR 
• SYSTEM BOUNDARY 
• <<INCLUDE>> ASSOCIATION 
• STEREOTYPE 
• USE CASE 


Self Study 2 [120 minutes] 


A much recommended technique for scoping the domain of the system 
you will be programming is the interview. The approach is to conduct 
an interview with someone who is knowledgeable about the people and 
processes involved. The objective is for you to have as clear a picture 
as possible of the domain covered by the proposed system. To do this, 
you must ask questions to prompt your interviewee for the information 
you need. If you were interviewing someone about the check-in 
process in our airport scenario, your interviewee might say 
“passengers check-in their luggage at the airline desk”. 


Write down what you are told in English. Then look at the nouns and 
verbs used in the conversation. As a starting point, you can pick the 
nouns as classes and the verbs as operations so, in our example, 
‘passengers’, ‘luggage’ and ‘airline desk’ are all nouns (and therefore may 
be classes) and ‘check-in’ is a verb (and therefore an operation). It is 
an effective way to begin thinking of what is involved in an 
object-oriented system. 
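
As a rough illustration of where this technique leads, the interview sentence above might eventually be turned into a Python skeleton like the one below. Every attribute and method body here is an assumption made for illustration; only the three nouns and the verb come from the interview.

```python
class Luggage:
    def __init__(self, tag):
        self.tag = tag  # hypothetical attribute


class AirlineDesk:
    def __init__(self):
        self.checked_in = []  # luggage accepted so far

    def accept(self, luggage):
        self.checked_in.append(luggage)


class Passenger:
    def __init__(self, name):
        self.name = name

    def check_in(self, luggage, desk):
        # The verb 'check-in' becomes an operation.
        desk.accept(luggage)


desk = AirlineDesk()
Passenger("A. Traveller").check_in(Luggage("BAG-1"), desk)
```

Each noun has become a class, and the verb has become an operation that links them.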


Try this as an exercise with another person. Pick a domain about which 
your interviewee is knowledgeable, carry out an interview and make 
notes of the processes involved. Check the nouns and verbs in your 
notes. See if you can then identify objects, attributes and operations as 
a result. 


A more detailed explanation follows: 


The process usually begins with client interviews. These interviews will 
produce class diagrams that can form the basis of your understanding 
of the system’s domain. These also give you a basic terminology to use 
when interviewing the system users. 


Next, user interviews are conducted which begin with determining the 
domain terminology. Domain terminology refers to the conceptual 
terms used by clients for their particular domain or ‘world’. There are 
many examples in the real world, such as sport-related terms and 
ideas. 


This is followed with sessions on identifying user terminology. User 
terminology refers to those words or terms used by the people using 
the system in relation to their interaction with it, and which may be 
specific to their jobs. 


The purpose of this session is to identify the system’s actors and the 
high-level use cases. These high-level use cases form the general 
system requirements as well as the boundaries and scope of the 
system. 


Subsequent interviews will expand on the basic requirements that have 
already been defined. These interviews should reveal the relevant 
scenarios and sequences in detail. 


In addition, use cases which define relationships, such as inclusion and 
extension, will probably be added. These use cases will be derived 
from the classes derived from the initial interviews. A less detailed 
understanding will result in too many use cases and too much detail, 
which will hinder the design and development process. 


From your own interview, develop some use cases and draw diagrams 
to demonstrate them. 


Self Study 3 

1. What do class diagrams achieve that use cases do not? 
2. Why are class diagrams central to object-oriented methods? 
3. What do class diagrams describe? 
4. What is the difference between a class diagram and an activity 
diagram? 
5. Name the types of object which may feature in class diagrams. 
6. What are the two principal kinds of static relationships? 
7. What is meant by static relationship? 
8. What kind of relationship is implied by the term inheritance? 
9. In a parent/child relationship which is the subtype? 
10. State the steps you need to go through to build class diagrams. 
11. Why is it important to define links between objects? 
12. What notation is used for class diagrams? 
13. How do you define domain? 
14. What are attributes and operations? 
15. At what stage in the class diagrams process are these details 
completed? 


16. What types of link can be defined in class diagrams? 
17. What techniques can you use to decide the class of an object? 


18. Lion, elephant, cheetah and dog are all instances of which abstract 
super class? 


19. What do subtypes inherit from superclasses? 


20. What is an object connection or link? 


21. What is meant by a role name? 


22. What is the difference between unidirectional and bi-directional 
navigability? 


23. What is multiplicity? Give examples. 

24. Name the two special variations of association. 

25. What is meant by aggregation? 

26. What is meant by composition? 

27. How are constraints represented in class diagrams? 


28. Class icons are divided into three panels. What information is 
entered into the top, middle and bottom panels? 


29. Which naming convention is used for attributes and operation 
descriptions? 


30. Which data types are associated with attribute names? 


Self Study 4 

Interaction diagrams: 

1. Which two diagram types make up interaction diagrams? 
2. What do sequence diagrams show? 
3. What is featured in collaboration diagrams and what do they show? 
4. In sequence diagrams, what is meant by lifeline? 


Self Study 5 

1. What is meant by activation? 
2. What is meant by messages? 
3. What forms can messages take? 
4. What are the two forms sequence diagrams take? 
5. What are the differences between the two forms? 
6. What additional information is in a generic sequence diagram? 
7. Is the sequence diagram equivalent to the ‘if’ and ‘while’ 
statements? 
8. Define recursion. 
9. How is recursion indicated in UML diagram forms? 
10. Do all objects have to be created at the beginning of a sequence? 


Self Study 6 

Collaboration diagrams: 

1. What do collaboration diagrams show? 
2. How do you show sequence in collaboration diagrams? 
3. How are conditions represented in collaboration diagrams? 
4. What is the notation for conditions in a collaboration diagram? 
5. Is there a limit to the number of conditions? 
6. How do collaboration diagrams visualise ‘if’ and ‘while’ statements? 
7. How do collaboration diagrams show objects communicating with 
multiple receivers? 


8. How do diagrams show objects communicating with multiple 
receivers in order? 


9. What is meant by a message signature? 
10. What is meant by an active object? 


11. What is meant by synchronisation? 


Self Study 7 


1. As the system changes from state to state, what is it important to 
know? 


2. What do state diagrams represent? 

3. What are the elements of notation for state diagrams? 
4. Is a state variable equivalent to a class? 

5. Is the state activity equivalent to a class? 

6. What do the events ‘entry’, ‘exit’ and ‘do’ signify? 
7. What is an event? 

8. Where can a change of state come from? 

9. What is an external event known as? 


10. What is an internal event known as? 

11. What follows a trigger event? 

12. What is meant by a guard condition? 

13. Define substate, sequential and concurrent substates. 
14. What is a composite state? 

15. What is a history state? 

16. What is meant by a shallow and/or deep history state? 
17. How is this shown in diagrams? 

18. What do signal messages do? 


19. What are pseudostates? 


Self Study 8 

Activity diagrams: 
1. What is the difference between state and activity diagrams? 
2. What can activity diagrams deal with? 
3. How do activity diagrams show concurrent activities? 


4. How do you show the sending and receiving of signals in activity 
diagrams? 


5. What is the difference between the modelling of processes and the 
modelling of operations? 


6. What do swim lanes show? 


In all exercises, you are encouraged to practise drawing diagrams as much 
as possible. You do not need software for this purpose — hand drawn 
diagrams will do. 


1 Learning Outcomes 


On completion of this chapter you will be able to: 


e Explain the need for thorough testing once a program is coded. 


e State, with examples, how unit, integration, system, acceptance and 
installation testing differ. 


e List the five components of a test description. 

e Describe the methods of debugging. 

e Write a test plan for a program and produce suitable test data. 

e Desk check a program and dry run the corresponding code. 

e Appreciate the problems and techniques of program maintenance. 


e Use diagnostic aids generated during compilation. 


2 Introduction 


This chapter concerns the importance of testing new aspects of systems 
under development, particularly software code, to reduce the risk of faults 
appearing when a system or application goes live. 


Some introductory points which need to be understood about the testing 
process are: 


e a planned and systematic approach is most effective, rather than a 
haphazard approach; 
e testing can be either manual or automated; 


e there are a variety of purposes for which testing is carried out. 


Ideally, code should be checked against formally established written 
criteria by testers carefully chosen for their competence in this process. 


e They must be able to check code by reading through it and 
deciphering how it performs, or be able to use software tools to run 
through code and spot mistakes. 


e They must also know the objectives of the testing, for example, to 
check accuracy of outputs, the logical flow of the program or 
performance levels of the system. 


Essentially, testing involves the operation of a system or application under 
controlled conditions, and an evaluation of the results to ensure the 
software does what it is supposed to do. For example, if the user enters 
names and addresses into a customer database running on a PC 
network, then closes the file, the new data should be saved to the server. 
This outcome can be tested in a controlled way, using the appropriate 
hardware, software and data and observing the results. If the data is 
saved, the software works correctly. If not, it is flawed and the faults must 
be identified and corrected. 


Testing under controlled conditions should include both normal and 
abnormal scenarios. Testing should intentionally attempt to create 
problems, in order to determine the impact of errors. For example, you 
may limit the entry of a password to six characters. Test what happens if 
you enter seven characters, or numbers rather than characters. 
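
The password example can be sketched as a small set of normal and abnormal test cases. This is only an illustration in Python: the function name is_valid_password and its rule (letters only, at most six characters) are assumptions made for the example and do not come from any real system.

```python
def is_valid_password(password: str) -> bool:
    """Hypothetical rule: 1 to 6 characters, letters only."""
    return 1 <= len(password) <= 6 and password.isalpha()


# Each case pairs an input with the outcome expected BEFORE the test is run.
test_cases = [
    ("secret", True),    # normal: six letters, exactly at the limit
    ("abc", True),       # normal: well inside the limit
    ("secrets", False),  # abnormal: seven characters
    ("123456", False),   # abnormal: numbers rather than letters
    ("", False),         # abnormal: nothing entered
]

results = [(value, expected, is_valid_password(value)) for value, expected in test_cases]

for value, expected, actual in results:
    status = "PASS" if expected == actual else "FAIL"
    print(f"{status}: input={value!r} expected={expected} actual={actual}")
```

Note that three of the five cases are deliberately abnormal. A test set containing only the first two cases would pass, yet say nothing about how the program copes with bad input.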


3 Why Test Software? 


Software may contain bugs for many reasons, such as: 


e Lack of effective communication on what the software should do. 


e The complexity of the software, which is increasing with modern 
systems. 


e Errors by programmers. 

e Requirements changing during a project. 

e Pressures of timescale. 

e Programmers underestimating the difficulty of the tasks involved. 

e Code not being properly documented. 

e Use of software development tools which themselves contain bugs. 


e Miscommunication or no communication — regarding the specifics of 
what an application should or should not do (the application’s 
requirements). 


Exercise 6.1 [30 minutes] 
Why may: 


software complexity; 
underestimation by programmers; 
poorly documented code; 
changing requirements; 
time pressures; 

lead to error prone software programs? 


Write down why these are causes of software error. 


There are risks associated with the development and delivery of any 
computer system. The major risks are that the system will: 

e produce incorrect results; 

e allow unauthorised transactions; 

e lose computer file integrity; 

e be unable to reconstruct processing; 

e lose continuity of processing; 

e deteriorate in performance to an unacceptable level; 

e compromise security; 

e not comply with organisational policy or governmental regulation; 
e produce unreliable results; 

e be difficult to use; 

e not be portable to other hardware and software; 


e be unable to interconnect with other computer systems. 


Testing is the means to detect the presence of any of these undesirable 
conditions and so prevent problems occurring. 


Organisations increasingly view testing seriously because technology is 
now so closely integrated into the day-to-day running of a business, that 
businesses cannot operate without computer technology. Computer 
systems are connected to supplier chains so that problems in one system 
can cascade into and affect others. The ‘knock on’ effect of just one 
problem condition, such as a wrong price, can cause hundreds or even 
thousands of similar errors within a few minutes. 


Organisations assign responsibility for testing in different ways. For 
example, responsibility could be assigned to one group or individual, or 
perhaps teams which include a mix of testers and developers, working 
closely together. Overall processes may be monitored by project 
managers. It will depend on what best suits an organisation’s size and 
business structure. 


Overall, it is important to have a planned and systematic approach to 
testing and solving the problems of flawed software. Testing aims to 
discover and eliminate defects or variance from what is expected. The two 
types of defects are: 


e Variance from specifications. A defect from the perspective of the 
build of the product. 


e Variance from what is desired. The defect from the user (or 
customer) perspective. 


To ensure that the risk of such variations is reduced, a structured 
approach to testing should be adopted which involves testing in every 
phase of the software development lifecycle and not only immediately 
prior to operation and maintenance. Throughout the time a program is 
specified, designed and coded, it can be checked by a system of 
verification and validation (V and V) procedures, to ensure that at each 
stage of production the transformation from one state to another, e.g. 
system analysis to design, has included all that was intended, and no 
errors have been introduced. 


Definition: Validation and Verification (V and V) 


A generic term for the complete range of checks to be performed on a 
system in order to increase confidence that the system is suitable for 
its intended purpose. 


The verification aspect involves an objective check to see how the 
system conforms to a well-defined specification. 


The validation aspect involves a more subjective assessment of likely 
suitability to the intended environment. 


The verification process answers the question, “Have we built the 
system correctly?” while the validation process addresses “Have we built 
the correct system?” 


The testing processes used throughout the software development 
lifecycle can be either: 


e static e.g. reading and checking documents concerning requirements, 
reading and checking code without running software 


e dynamic e.g. running code to check outputs. 


How appropriate it is to use one type rather than the other will depend to a 
large extent on the phase of the project. Clearly, in the earlier stages of 
the software development lifecycle, not much code will have been written, 
therefore dynamic testing is not possible and static testing is appropriate. 


The verification process (checking that developers are building the 
product correctly) typically involves reviews and meetings to evaluate 
documents, plans, code, requirements, and specifications. This entails: 


e checklists; 
e issues lists; 
e walkthroughs; 


e inspection meetings. 


Observation of typical error levels indicates that the majority of errors 
occur at the requirements specification stage of program development 
and that, if not identified early in the development of a program, will result 
in much work to correct at a later stage. 


The validation process typically involves actual testing and takes place 
after verification is complete. 


Testing is used to: 


e demonstrate the validity of the software at each stage in the system 
development lifecycle; 


e check that the final system meets user needs and requirements, as 
specified; 


e examine the behaviour of the software or the system by using sample 
test data. 


Defects remain undetected either through not looking, or looking but not 
seeing. ‘Not looking’ is where tests are not performed because a 
particular test condition remains unknown. ‘Looking, but not seeing’ is like 
losing a personal possession only to discover it was in plain sight all 
along. This can happen particularly if you test your own code — sometimes 
familiarity leads to oversight and errors go unnoticed. 


However, it should be pointed out that testing has its limitations. No 
amount of testing can improve a program, nor actually prove that it is 
accurate. In fact there are severe limitations to the level of accuracy 
testing can demonstrate, but there is currently no practical alternative. 
Testing provides the only way of even attempting to demonstrate that a 
program meets the conditions and requirements laid down in the original 
program specification document. 


Study Note 


IBM has demonstrated that an application system during the system 
development lifecycle (SDLC) will produce 60 errors (defects). Testing 
prior to coding is 50% effective in detecting errors, and after coding, 
80% effective. This study and others showed that it is at least ten times 
as costly to correct an error after coding as before, and 100 times as 
costly to correct a production error. 


Source: Perry (2000) 


The cost of defect identification and correction increases as the project 
progresses. 


4 Documentation of Tests 


It is important that tests are described fully before they are carried out. 
Anyone carrying out a test without first describing the expected outcome 
is not testing, but experimenting. 


The description should include: 


e the identity of the component to be tested; 


e the purpose of the test (i.e. which function of the component is to be 
tested); 


e the conditions under which the test is to be carried out (i.e. values of 
any global conditions, status of files etc.); 


e the test data to be used; 


e the expected outcome (i.e. new values of global variables, new file 
status, etc.). 


These are the five key components of a test document, and they are 
followed by the recording of the actual outcome. 
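
The five components, plus the actual outcome recorded afterwards, can be held in a simple record. The following is a minimal sketch in Python; the class name PlannedTest, its field names and the sample values are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlannedTest:
    component: str          # identity of the component to be tested
    purpose: str            # which function of the component is tested
    conditions: str         # global conditions, status of files, etc.
    test_data: str          # the data to be used
    expected_outcome: str   # written down before the test is run
    actual_outcome: Optional[str] = None  # recorded after the test is run

    def passed(self) -> Optional[bool]:
        """None until the test has been run, then True or False."""
        if self.actual_outcome is None:
            return None
        return self.actual_outcome == self.expected_outcome


planned = PlannedTest(
    component="save_customer (hypothetical module)",
    purpose="New customer record is written to the customer file",
    conditions="Customer file exists and is empty",
    test_data="name='A. Smith', address='1 High Street'",
    expected_outcome="file contains 1 record",
)
print(planned.passed())   # the test has not been run yet

planned.actual_outcome = "file contains 1 record"
print(planned.passed())
```

Because expected_outcome is a required field, a test cannot even be written down without stating what should happen, which is the distinction drawn above between testing and experimenting.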


The word global, as used above, refers to those areas in the module 
under test which are shared with the rest of the program. 


The type and range of test data, along with the other details, can be 
drawn up as the module is being designed. 


It is important to keep in mind that the test data should be developed from 
the design and not from the implementation. This way, the design is 
automatically re-checked to ensure that it remains unchanged. 


Exercise 6.2 [30 minutes] 


Name the details which must be included in a test plan. Why are these 
particular features so important? Design a form for test planning. 


Include the key headings and against each heading write a description 
of the type of information required. What happens if any of the 
headings are omitted? 


5 Levels of Testing 


There are five levels of testing: 


e Unit testing — This is the lowest level of testing; it involves testing 
each module of a program in isolation. Unit testing is a ‘micro’ scale of 
testing and involves the testing of particular functions or code 
modules. Typically it is carried out by the programmer and not by 
testers, as it requires detailed knowledge of the internal program 
design and code. It is not always easy to do and requires the 
application to have a well-designed architecture with tight code; it 
may require test driver modules or test harnesses to be developed. 


e Integration testing — This is a test to see if the linkages between each 
tested module work correctly; incremental integration testing involves 
continuous testing of an application as new functionality is added. It 
requires that various aspects of an application’s functionality be 
independent enough to work separately before all parts of the 
program are completed, or that test drivers are developed as needed; 
it is carried out by programmers or testers. 


e System testing — This tests the system as a whole to ensure that the 
components fit together properly. The components can be code 
modules, individual applications, client and server applications on a 
network, etc. In small scale systems, integration and system testing 
are frequently combined, especially if the smallest unit of code being 
produced is a program. 


e Acceptance testing — This is the final testing based on the 
specifications of the end-user/customer, or on use by the 
end-user/customer over a limited period of time. It determines if the 
customer finds the software satisfactory. 


e Installation testing — This is the testing of the system as a whole as it 
goes live on the customer’s hardware and in the customer's 
environment. 


Acceptance testing and installation testing both occur in the final stages of 
program testing. 


As stated earlier, unit testing involves the testing of small units of code, 
e.g. programs, modules or even procedures, in order to ensure that they 
carry out their intended functions. It takes place after the code has been 
produced, but before any integration or system testing has begun. 
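
As an illustration, a unit test for a single small function might look like the sketch below. It uses Python's standard unittest module; the function net_price and its discount rule are hypothetical and exist only to give the tests something to check.

```python
import unittest


def net_price(gross: float, discount_percent: float) -> float:
    """Hypothetical unit under test: apply a percentage discount."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount must be between 0 and 100")
    return round(gross * (1 - discount_percent / 100), 2)


class NetPriceTest(unittest.TestCase):
    def test_normal_discount(self):
        self.assertEqual(net_price(200.0, 25), 150.0)

    def test_no_discount(self):
        self.assertEqual(net_price(80.0, 0), 80.0)

    def test_invalid_discount_is_rejected(self):
        with self.assertRaises(ValueError):
            net_price(80.0, 120)


# Run the tests for this one unit in isolation from the rest of the program.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NetPriceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all passed:", result.wasSuccessful())
```

The function is exercised on its own, with no other module involved, so any failure can only come from the unit itself.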


The amount of testing needed will vary in proportion to the size and 
complexity of the module being tested. As a rule of thumb, those modules 
containing the most lines of code (excluding comments) are the likeliest to 
cause problems and therefore need to be tested more thoroughly. More 
complicated methods of measuring program complexity do exist, but are 
beyond the scope of this programme. 


It is also worth recording the number of errors found during walkthroughs 
and other checking procedures of the code for each module. This number, 
relative to the lines of code in the module, will indicate which modules are 
error prone and are going to need most testing. 


Definition: Walkthrough 


A product review performed by a formal team. A number of such 
reviews may be carried out during the lifetime of a software project, 
covering, for example, requirements, specification, design and 
implementation. 


The review is formally constructed; there is a clear statement of the 
contribution that each member of the review team is required to make, 
and a step by step procedure for carrying out the review. The person 
responsible for development of the product ‘walks through’ the product 
for the benefit of the other reviewers, and the product is then openly 
debated with a view to uncovering problems or identifying desirable 
improvements. 


Experience has shown that modules having a high number of errors per 
1000 lines of code later prove to need disproportionate amounts of testing 
and maintenance. (For modules which have less than 1000 lines, the 
values should be calculated proportionately, e.g. 4 errors in 250 lines = 16 
errors / 1000 lines.) 
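
The proportional calculation can be written as a short routine and used to rank modules. This is a sketch only; the module names and the error and line counts are invented, apart from the 4 errors in 250 lines taken from the example above.

```python
def errors_per_kloc(errors_found: int, lines_of_code: int) -> float:
    """Scale an error count to a rate per 1000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return errors_found * 1000 / lines_of_code


# Hypothetical walkthrough results: module name -> (errors found, lines of code)
walkthrough_results = {
    "invoice": (4, 250),
    "ledger": (6, 1200),
    "reports": (3, 600),
}

density = {
    name: errors_per_kloc(errors, lines)
    for name, (errors, lines) in walkthrough_results.items()
}

# Most error-prone modules first: these are the candidates for extra testing.
for name in sorted(density, key=density.get, reverse=True):
    print(f"{name}: {density[name]:.1f} errors per 1000 lines")
```

Although the ledger module has the largest raw error count, the invoice module has by far the highest rate, and the rate is what indicates where extra testing effort should go.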


5.1 Unit Testing 

Unit testing can be made easier by designing modules with a high level of 
cohesion, i.e. having only one function. This makes error identification 
easier, and requires fewer tests. 


There are two main approaches to testing, known as: 


e Black box testing — which is not based on any knowledge of internal 
design or code. Tests are based on requirements and functionality. 


e White box testing — which is based on the knowledge of the internal 
logic of an application’s code. Tests are based on coverage of code 
statements, branches, paths, conditions. 


For a complete software examination, both white box and black box tests 
are required. 


In black box testing, the piece of code being examined is treated as a box 
into which data is input and from which data (and messages) are output, 
to be checked for accuracy. 


The tester is not concerned with what the contents of the box (the code) 
looks like or whether all the lines of code have been executed. The test 
data to be used is extracted from the specification and, as long as the 
coding behaves exactly as it should, then the module has achieved the 
required level of accuracy. We have seen that the achievement of 
absolute accuracy is not practical, thus this method of testing is limited in 
terms of the level of accuracy that can be achieved. It also fails to indicate 
what proportion of the code has actually been executed. In general, it is 
estimated that only 60% of the lines of code are actually tested this way, 
which leaves much of the code in the module untested! 


Thus there are likely to be routines in the code which are not activated 
during testing, because only a sample of data values is input. Untested 
values may cause the module to fail after it has gone live. 


White box testing on the other hand, aims to ensure that every part of the 
program has been activated during testing to confirm that the program is 
working correctly and also to make sure that no logic bombs remain. 


Definition: Logic bomb 


A piece of code intentionally inserted into a software system that will 
set off a malicious function when specified conditions are met. For 
example, a programmer may hide a piece of code that starts deleting 
files (such as the salary database) should they ever leave the 
company. 


In white box testing, the module is divided into units consisting of either 
specific lines of code or logical paths, and these are monitored during 
testing to find out which ones have been executed. 


As with black box testing however, it is not possible to test every line or 
every path: a simple 16 line COBOL module can have up to 1,000,000 
possible logical paths. Simply testing each line once would be an 
inadequate sample and leave some statement options, e.g. IF THEN 
ELSE constructs, unchecked. On the other hand, testing all paths would 
be impractical in terms of time. Again, a compromise has to be accepted. 
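
One way to see why the number of paths grows so quickly is to assume a module containing a series of independent two-way decisions: each decision doubles the number of paths. The sketch below makes that assumption explicit. It is an illustration of the growth, not an account of how the COBOL figure above was derived, and loops would increase the count further.

```python
from itertools import product


def count_paths(decisions: int) -> int:
    """Paths through `decisions` independent two-way (IF THEN ELSE) branches."""
    return 2 ** decisions


# Enumerate the paths explicitly for a small case to confirm the formula.
small_case = list(product(("then", "else"), repeat=3))
print(len(small_case), count_paths(3))

for n in (5, 10, 20):
    print(f"{n} decisions -> {count_paths(n):,} paths")
```

Under this assumption, twenty decisions already give more than a million paths, which is why exhaustive path testing is not practical.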

A simple measure of test effectiveness is the percentage of statements 
executed. The minimum level acceptable is 85% of all statements. This 
figure is a crude approximation of the amount of code likely to be 
executable (comments and abnormal end routines, for instance, may not 
be actionable). 


Confirmation of the code which has been tested is usually achieved by 
‘instrumenting’ it: adding statements which display a message when 
activated, such as ‘print block 50 executed’. 
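
Instrumenting can be sketched as follows. The module classify_order, its block numbers and the test values are hypothetical. For simplicity the sketch counts blocks rather than individual statements, so the percentage it reports is only an approximation of statement coverage.

```python
executed_blocks = set()


def mark(block_id: int) -> None:
    """Instrumentation statement: record and report that a block has run."""
    executed_blocks.add(block_id)
    print(f"block {block_id} executed")


def classify_order(quantity: int) -> str:
    """Hypothetical module under test, split into four numbered blocks."""
    mark(10)
    if quantity <= 0:
        mark(20)
        return "rejected"
    if quantity > 100:
        mark(30)
        return "bulk"
    mark(40)
    return "standard"


ALL_BLOCKS = {10, 20, 30, 40}

# A test set that uses only 'typical' values.
for quantity in (1, 5, 50):
    classify_order(quantity)

coverage = 100 * len(executed_blocks) / len(ALL_BLOCKS)
missed = sorted(ALL_BLOCKS - executed_blocks)
print(f"coverage: {coverage:.0f}% of blocks; not executed: {missed}")
```

The three test values all produce correct output, yet the instrumentation shows that the blocks handling rejected and bulk orders were never executed. Adding a zero quantity and a quantity above 100 to the test data would close the gap.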


Exercise 6.3 [30 minutes] 
What is meant by black box testing? 


How is it different from white box testing? 


Is black box or white box testing affected by object-oriented design? 


5.2 Integration Testing 


Integration testing follows on from unit testing and is concerned with 
testing how modules interface with each other. This is extremely important 
where modules are produced separately by different programmers. 


The aim of integration testing is to identify situations where: 


e data is lost between modules; 
e one module creates a fault in another module; 


e the combination of several modules creates a major undesirable 
side-effect. 


Sometimes, integration testing involves the retesting of modules already 
tested at the unit level. This may occur when a desired result can only be 
achieved by combining more than one program unit, so cannot be tested 
until those units are integrated. 


The amount of integration testing needed depends on the level of unit 
testing carried out. Ideally, all unit testing should be completed 
before integration begins so that integration starts with units that contain 
no known faults. 


Each module's interfaces will have been tested in isolation in unit testing, 
but now they should be checked to ensure that all the incoming calls and 
messages are expected and in the correct format, and that all outgoing 
messages and calls are also in the correct format and are being sent to 
the correct destination. 


There are several options for controlling the way in which modules are 
combined for integration testing. 


The first of these is the number of modules that will be combined at one 
time. 


There are three possible alternatives: 


e big bang — where all modules are integrated at once; 


e phased — where modules are integrated according to their level in the 
design structure, usually several at once; 


e incremental — where modules are added one at a time. 


The big bang approach is usually a desperate measure, when a project is 
running late, as it relies on the ‘hope for the best’ approach. The errors 
revealed are often difficult to associate with a particular module, 
especially in a large system. 


The incremental approach enables new errors revealed during testing to 
be identified easily and is generally the most successful method. It does 
need more individual integration steps however, and if saving time is vital, 
then phased testing may be a more practical option. 


The second consideration is the order in which the units are to be 
combined. This can be: 

e top-down; 

e bottom-up; 


e sandwich integration. 


A top-down approach begins at the top level and adds modules from the 
levels immediately below, either a group at a time (phased) or one at a 
time (incremental). The top-down approach needs stubs to be added, to 
represent calls to, and replies from, those modules which have not yet 
been integrated. 


Definition: Stub 


A temporary implementation of part of a program for debugging 
purposes. 


For procedure calls, a simple ‘return’ is often all that is needed, 
although in some cases data may be needed as well. Stubs should be 
as simple as possible. 
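
A stub might look like the sketch below. The module names and the fixed rate are invented for illustration: the higher-level convert routine is the code being integrated, and the stub stands in for a lower-level module that is not yet available.

```python
def fetch_exchange_rate_stub(currency: str) -> float:
    """Stub for a lower-level module that has not yet been integrated.

    It ignores its argument and returns fixed data, which is all the
    calling module needs in order to be exercised.
    """
    return 2.0


def convert(amount: float, currency: str, fetch_rate=fetch_exchange_rate_stub) -> float:
    """Higher-level module under test; it calls the lower-level module."""
    return amount * fetch_rate(currency)


print(convert(10.0, "EUR"))
```

When the real lower-level module is ready, it replaces the stub as the fetch_rate argument and the same tests are run again.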

Bottom-up testing begins with the lowest level modules and adds new 
modules from successively higher levels. Bottom-up is usually carried out 
using a phased approach rather than an incremental one. It needs dummy 
modules called drivers to execute the low level modules, as the real 
modules that will call and control them will not yet have been integrated. 


A driver can be as simple as a program instruction to call the module 
being tested, and print any output, or a full test harness simulating input, 
messages etc. 
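
The simplest kind of driver can be sketched as follows. The low-level module calculate_vat, its default rate and the input values are hypothetical; the driver does nothing more than call the module and print what comes back.

```python
def calculate_vat(net: float, rate_percent: float = 20.0) -> float:
    """Low-level module under test (hypothetical)."""
    return round(net * rate_percent / 100, 2)


def driver() -> list:
    """Driver: calls the module with chosen inputs and prints any output."""
    outputs = []
    for net in (0.0, 100.0, 250.0):
        vat = calculate_vat(net)
        print(f"calculate_vat({net}) -> {vat}")
        outputs.append(vat)
    return outputs


driver_outputs = driver()
```

A driver of this kind is discarded once the real calling module has been integrated, so it is usually kept as small as the test allows.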


As both top-down and bottom-up testing have advantages and 
disadvantages, a sandwich approach is often adopted as an alternative, 
especially for very large systems. With this method, top-down integration 
is started at the top level at the same time as bottom-up integration is 
started from the bottom. Somewhere in the middle, the bottom-up 
modules are integrated into the top-down modules as one module. Both 
stubs and drivers are needed for the higher and lower level modules 
respectively, but no modules require both, and at the middle level the last 
modules integrated need neither. 


The method chosen depends on which method’s advantages are 
considered to be most important. Some programmers prefer to develop 
their code one way and integrate the opposite way. Adopting the 
sandwich method avoids having to make a choice. 


5.3 System Testing 


System testing aims to check that new programs operate together as a 
working system and conform to the requirements specification. It is 
implemented as a large black box with examples of actual data and 
transactions being used to check that all the functions and features 
conform to the specification. 


There is also end-to-end testing, which is similar to system testing, at the 
‘macro’ end of the test scale. This involves testing of a complete 
application environment in a situation which mimics real-life use, such as 
interacting with a database, using network communications, or interacting 
with other hardware, applications, or systems if appropriate. 


Exercise 6.4 [30 minutes] 


You are a software development team leader responsible for training 
new recruits to the team. You are asked to give a presentation to 
explain to them how your organisation builds information systems and 
tests its software. Write down in note form, the topics and explanations 
you plan to include in your presentation. 


In addition to the tests described earlier, system testing should include the 
following: 


e Facility checking — Check that each facility included in the 
requirement specification is actually implemented. It can normally be 
performed without the use of a computer. It is sometimes sufficient to 
compare the program objectives with the user documentation. 


e Volume testing — Test the program with large volumes of data. Its 
purpose is to demonstrate that the program can handle the volume of 
data specified in its objectives. 


e Performance testing — Test under pre-specified conditions to assess 
whether system performance can be improved. This can uncover 
situations which will lead to degradation or possible system failure. 


e User interface checking — Check that program interfaces comply with 
the details laid down in the user documentation and the original 
specification. 


e Security checking — Check that all protection mechanisms guard the 
system from accidental or unauthorised penetration. Companies are 
increasingly concerned about privacy, as many programs have 
specific security objectives. 


e Recovery testing — Forcing the system to fail in various ways to check 
that a proper recovery can be performed. 


e Error exit checking — Check that each system error message is 
correct and that when an exit takes place, the system is left in a tidy 
state. 


e Help information checking — Ensuring that the help facility for the 
system is adequate for a new user. 


The objective is to ensure that the system meets the prescribed level of 
quality, and that as many errors as possible are uncovered before the 
system goes live. Much effort is required to write and run system test 
suites, as well as to schedule and control them. 


Exercise 6.5 [15 minutes] 


From the list of tests given above, identify which tests help to ensure 
that the system: 


(a) runs fast enough; 


(b) is acceptable to users. 




Chapter 6 — Testing Programming Methods 


5.4 Acceptance Testing 


This is the process of comparing the program with its initial requirements 
specification and the current needs of its potential users. The description 
of this test is defined in the initial requirements and it includes the form, 
the quantity and the quality of what is to be delivered. Note that 
acceptance testing demonstrates the way the system works. It is not true 
testing, because the purpose is to provide confidence that the delivered 
system meets the initial requirements. 


5.5 Installation Testing 


The purpose of the test is to find installation errors rather than software 
errors. When the program is installed, files or libraries are created, the 
hardware is configured, and the program itself may have to interconnect 
with an operating system. As errors can occur during any of these 
operations, it is important to identify and correct them in advance. 


6 Desk Checking and Dry Running 


The processes whereby programmers read the programs before they are 
tested by running the code on a computer, are: 


e desk checking; 


e dry running. 


Study Note 


It is not recommended that programmers desk check their own 
programs, as it is generally accepted that they are not very effective in 
testing their own programs. It is much more difficult to find errors in 
one’s own code than in someone else’s! For this reason, desk checking 
is best performed by a person other than the author of the program 
(e.g. two programmers may swap programs). 


Desk checking aims to check the action or process of a program and 
should be carried out according to the following guidelines: 
e Variables list 


— all variables should be checked, in order to find undeclared and 
incorrect variables. 


e Subroutines, functions 


— each call that is made should be to an existing subroutine or 
function and the parameters should be in the right order and of 
the right number and type. 


e Constants 


— for all constants, the value (e.g. π = 3.1415), the type (e.g. 
integer, real) and format (e.g. octal, decimal) should be checked. 


e Equation of variables 


— when two variables are equated, the reader should check that the 
data types are consistent and explain any that are not. Note that 
compilers for modern, well-structured languages can do most of 
this automatically. 


The next step after the desk checking of a program design is dry running 
the corresponding code. 


Again, it is useful for a person other than the author to test the code. This 
activity can be considered to be a more detailed level of desk checking, 
where the reader pays close attention to reading the code, rather than 
merely scanning it. Each instruction is examined to find out what it does. 


Dry running a program segment involves the execution of the segment 
with the programmer acting as a computer. The success of the technique 
depends upon the ability to simulate a computer’s action without making 
any assumptions at all. 


Study Note 


Compare this simulation technique with CRC (Class, Responsibility, 
Collaboration) exercises. Do you see similarities in the approach? 


First, a table is drawn with one column for each data variable in the 
program, including any indexes or loop counters. The initial values are 
then entered, exactly as defined by the program. The code is then worked 
through, one command at a time, and its effect is simulated by entering 
new values for any variables that are used. 
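
The table produced by a dry run can be illustrated with a short program that records each variable after every command. A real dry run is done by hand on paper; the sketch below merely generates the table a careful reader should arrive at. The segment being traced (summing the integers 1 to 3) is invented for the example.

```python
# Segment being dry run (hypothetical):
#   total = 0
#   i = 1
#   while i <= 3:
#       total = total + i
#       i = i + 1

trace = []  # one row per executed command: (command, i, total)


def record(command: str, i, total) -> None:
    trace.append((command, i, total))


total = 0
record("total = 0", None, total)   # i has no value yet
i = 1
record("i = 1", i, total)
while i <= 3:
    total = total + i
    record("total = total + i", i, total)
    i = i + 1
    record("i = i + 1", i, total)

print(f"{'command':<20}{'i':>5}{'total':>7}")
for command, i_value, total_value in trace:
    print(f"{command:<20}{str(i_value):>5}{total_value:>7}")
```

The final row shows i holding 4, one more than the loop limit. This is exactly the kind of end-of-loop detail that is easy to get wrong when an effect is anticipated rather than worked through.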


Anticipating an effect is the major pitfall, especially in the case of loops 
and conditions. The value of each condition must be carefully worked out 
and the actions and indexes in loops need to be tracked step by step, 
especially at the beginning or the end of the loop. 


This technique can be very time consuming, especially when the logic is
complicated, but yields the best results when checking by a totally
independent third party is not possible.
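

The trace table used in a dry run can be illustrated with a short sketch. The Python routine below is a hypothetical example, not taken from the course material: it sums the squares of 1 to n and records one row of the table after each pass through the loop, which is what the person doing the dry run would write down by hand.

```python
def sum_of_squares(n):
    """Sum 1*1 + 2*2 + ... + n*n, recording a trace-table row after each pass."""
    trace = []          # one row per loop pass: (i, total)
    total = 0
    i = 1
    while i <= n:
        total = total + i * i
        trace.append((i, total))
        i = i + 1
    return total, trace


total, trace = sum_of_squares(3)
print(" i | total")
for i, running_total in trace:
    print(f"{i:2d} | {running_total:5d}")
print("result:", total)
```

When working by hand, the rows for the first and last passes deserve the most attention, since that is where off-by-one mistakes in the loop condition show up.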


7 The Diagnostic Aids Generated during
Compilation or Run time


To be able to determine the cause of problems in the code, programmers 
need good diagnostic skills. They must be able to: 


• make effective use of all the clues provided


• obtain as many clues as possible.


Operating systems offer a number of utilities that can be used to debug a 
program. 


While testing a program that generates or changes the contents of a file, 
the programmer must be able to examine the contents. 


In the same way, it helps to be able to examine the contents of selected 
memory variables while the program is running, or at the point where it 
fails. 


Early utilities offered the facility to dump the contents of the file and 
memory in an octal or hexadecimal format, i.e. the characters would be 
printed out without any effort being made to interpret them. The 
programmer had to painstakingly trace through all this code in order to 
obtain the required information. 


Definition: Dump 


A printed version of system memory taken when a system crash has 
occurred. 


By referring to compiler and linker listings, and using octal or hexadecimal 
arithmetic, it was possible first to determine the addresses of the required 
locations and then to interpret the contents. 
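

To give a feel for what a dump looks like, the following Python sketch prints a sequence of bytes in hexadecimal, eight bytes per line, with the offset of each line on the left. The function name `hex_dump` and the sample data are assumptions made for this illustration. Unlike the early utilities described above, it also adds a column showing any printable ASCII characters, which is the sort of help later tools provided.

```python
def hex_dump(data: bytes, width: int = 8) -> str:
    """Format bytes as offset, hexadecimal values and printable ASCII characters."""
    hex_width = width * 3 - 1          # two digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Show printable ASCII characters; replace everything else with '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04x}  {hex_part:<{hex_width}}  {text_part}")
    return "\n".join(lines)


print(hex_dump(b"TOTAL=42\n\x00\x01"))
```

Without the right-hand column, the programmer would have to convert each pair of hexadecimal digits to a character or number by hand.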


7.1 Typical Facilities Available during Interactive
Debugging


Nowadays, there are a number of symbolic debuggers and program
tracers which considerably reduce the effort required in testing. One
popular utility prints or displays the contents of a file using a general code
such as ASCII.


The trace package offered by most compilers and interpreters allows a
programmer to watch the program execute by displaying the name of the 
current module or the number of the current statement under execution. 
Some packages also allow the programmer to specify certain variables, 
which are then displayed whenever their values change. 
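

A minimal sketch of such a trace facility is given below, assuming Python and its standard `sys.settrace` hook. The names `make_tracer` and `countdown` are hypothetical and exist only for this example. The tracer reports a watched variable whenever its value has changed; the line number shown is the statement about to be executed when the change was noticed.

```python
import sys

_MISSING = object()   # marker for "variable not yet defined"


def make_tracer(watched, log):
    """Build a trace function that logs changes to the variables named in `watched`."""
    last_seen = {}

    def tracer(frame, event, arg):
        if event in ("line", "return"):
            for name in watched:
                value = frame.f_locals.get(name, _MISSING)
                if value is not _MISSING and value != last_seen.get(name, _MISSING):
                    last_seen[name] = value
                    log.append(f"line {frame.f_lineno}: {name} = {value!r}")
        return tracer   # keep tracing inside this frame

    return tracer


def countdown(start):
    remaining = start
    while remaining > 0:
        remaining -= 1
    return remaining


log = []
previous = sys.gettrace()
sys.settrace(make_tracer(("remaining",), log))
try:
    countdown(2)
finally:
    sys.settrace(previous)   # always switch tracing off again

print("\n".join(log))
```

Tracing every statement slows a program down considerably, which is one reason trace packages let the programmer choose which variables to watch.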


A debugger package accepts built-in checkpoints where the program 
execution halts (with the program memory left intact) and allows memory 
examination. The memory addresses corresponding to each variable can 
be determined using a cross-reference table provided. The name of the 
variable, of course, is not understood outside the program. 


Most batch operating systems automatically record major events taking 
place on the computer, in a system log file. Information such as files that 
have been opened and/or closed, and the number of file transfers, can be 
found here. Some systems can even log the number of instructions 
executed, together with a list of the last N (typically 16) instructions. This 
type of information can be very useful when trying to reconstruct the 
sequence of events that has led to program failure. 


Today’s fourth generation languages (4GLs) and integrated development 
environments (IDEs) generally provide many more facilities for debugging. 
Along with trace packages, there is a facility for single stepping through 
the program, i.e. executing the program one instruction at a time, and 
examining memory variables as required. 


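Definition: IDE


Integrated Development Environment: a software application that brings
together the tools needed for programming, typically a source code
editor, a compiler or interpreter and a debugger. The term is also applied
to a programming language integrated within an application, such as
BASIC within the Microsoft Office suite of applications.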


A facility called animation displays source code at run time, highlighting 
the current instruction as it is executed. This facility can be combined with 
single stepping to obtain single step with animation. 


4GLs generally allow the variables to be accessed by name if the program
halts. Not only is the memory saved, but also the variable names can be 
made to retain their meaning when outside the program. 


Perhaps the greatest danger with interactive debugging is that of the
unrecorded fix. There is a great temptation, having made the program
work, to forget to update the documentation.




Chapter 6 — Testing Programming Methods 




All that is required is adequate self-discipline. Easy to say, but not so 
easy to achieve! When in an interactive environment, it is important to 
save the latest working version of any program which has been changed, 
and to destroy all out of date versions. Also, when editing source 
programs, it is important that the version number is recorded correctly in 
the source. 


The range of available automated testing and debugging tools has
increased considerably. A popular automated tool is the
record/playback type. This type allows a tester to click through all the
combinations of menu choices, dialogue box choices, buttons, etc. in an
application GUI and have them ‘recorded’ and the results logged. If new
buttons are added, or some underlying code in the application is changed,
the application can be retested by just ‘playing back’ the ‘recorded’
actions and comparing the logged results to check the effects of the
changes.


The problem with such tools is that if there are continual changes to the 
system being tested, the ‘recordings’ may have to be changed so much 
that it becomes very time-consuming to continuously update the scripts. 
Additionally, interpretation of results (screens, data, logs, etc.) can be a
difficult task. Note that there are record/playback tools for text-based 
interfaces also, and for all types of platform. For small projects, the time 
needed to learn and implement them may not be worth it. For larger 
projects, or on-going long-term projects, they can be valuable. 
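

The record/playback idea can be sketched in a few lines of Python. Everything here is hypothetical: the `Counter` class stands in for an application with buttons, and `record` and `play_back` are simplified stand-ins for what a commercial tool does against a real GUI.

```python
class Counter:
    """Hypothetical application under test: a counter driven by named button presses."""

    def __init__(self):
        self.value = 0

    def press(self, button):
        if button == "increment":
            self.value += 1
        elif button == "reset":
            self.value = 0
        else:
            raise ValueError(f"unknown button: {button}")
        return self.value


def record(app, buttons):
    """Drive the application and log each action with the result it produced."""
    return [(button, app.press(button)) for button in buttons]


def play_back(app, recording):
    """Replay recorded actions and return the steps whose results have changed."""
    differences = []
    for step, (button, expected) in enumerate(recording, start=1):
        actual = app.press(button)
        if actual != expected:
            differences.append((step, button, expected, actual))
    return differences


recording = record(Counter(), ["increment", "increment", "reset", "increment"])
print(recording)
print(play_back(Counter(), recording))   # an unchanged application gives no differences
```

The maintenance problem described above is visible even here: if the button names change, every recording that mentions them has to be updated.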


Other automated tools now available include: 
• code analysers, which monitor code complexity, adherence to
standards, etc.;


• coverage analysers, which check the parts of the code that have been
exercised by a test, and may be oriented to code statement coverage,
condition coverage, path coverage, etc.;


• memory analysers, such as bounds-checkers and leak detectors;


• load/performance test tools, which test client/server and web
applications under various load levels;


• web test tools, which check the validity of links, that the HTML code
usage is correct, that client-side and server-side programs work, and
that a website’s interactions are secure;


• miscellaneous tools for test case management, documentation
management, bug reporting, and configuration management.


Definition: Bounds checker


A program which sits between the operating system and the application 
being run. It monitors the system calls for bad parameters and bad 
return values, stopping the program when there are signs of trouble. 
Typically, each memory allocation is monitored for over-writes and 
double-frees. If a memory block is not released before the program 
terminates, a warning message is displayed. 


Definition: Memory leak 


A bug in a program which prevents it from freeing up memory which is 
no longer needed. As a result, the program uses more and more 
memory until it finally crashes because there is no more memory left. 


Definition: HTML 


Short for HyperText Markup Language, the authoring language used to 
create documents on the World Wide Web. HTML is similar to SGML 
(Standard Generalised Markup Language), although it is not a strict 
subset. 


HTML defines the structure and layout of a web document by using a 
variety of tags and attributes. 


Definition: Client 


The client part of a client-server architecture. Typically, a client is an 
application which runs on a personal computer or workstation and 
relies on a server to perform some operations. For example, an email
client is an application which enables you to send and receive email. 


Definition: Server 


A computer or device on a network which manages network resources. 
For example, a file server is a computer and storage device dedicated 
to storing files. 
Any user on the network can store files on the server. A print server is a
computer which manages one or more printers and a network server is a 
computer which manages network traffic. A database server is a 
computer system which processes database queries. 


Servers are often dedicated, meaning that they perform no other tasks 
besides their server tasks. In multiprocessing operating systems however, 
a single computer can execute several programs at once. A server in this 
case could refer to the program managing the resources rather than the 
entire computer. 


Exercise 6.6 [60 minutes] 


For classroom discussion. You are about to embark on a software 
project and before you start you must consider which software tools will 
help you with testing your project. What options could you consider? 
What will influence your decision-making? 


Research task: to help this discussion, you should undertake some
research to find out more about the range of tools available and the
purposes they serve.


Exercise 6.7 [25 minutes] 


Compare and contrast static and dynamic ways of testing. What is 
meant by each term, and when may they be used in the software 
development process? Give examples of each approach. 


8 The Problems and Techniques of Program 
Maintenance 


8.1 Problem Solving and Debugging 


All programs require maintenance. When users have a problem it is 
essential to find a solution rapidly. Solving the problem can be achieved 
using a four-stage process: 


1. Understand the problem: the worst mistake that can be made is
failing to understand the problem thoroughly and correctly.
Understanding the problem involves the following:


• studying the available facts;


• investigating what is not known;


• deciding whether more information is needed;


• identifying contradictory information.


There may be a lot to examine, but the aim is to obtain an overview of 
the problem. 


2. Devise a plan: before starting to produce a solution, the aim should be 
defined. Next, alternative methods of achieving the aim should be 
considered and the best one chosen. Past experience is often a useful 
guide, e.g. has a similar problem occurred before? Could this help in 
devising a solution to the whole or part of this problem? 


3. Carry out the plan: keep to the plan, and check at each step that the 
problem is being solved as planned. 


4. Review the solution: the solution should be tested to ensure that it 
solves the problem. If it does not, it should be checked to assess 
whether all the relevant information has been used; if it does, then 
details should be kept in case it is useful for solving another problem 
in the future. 


These guidelines can be tailored to debugging (finding and removing) 
software errors as follows: 


Identify the fault: it is useful to think of the malfunction, e.g. a printout 
value, as an error, and the cause in the software, as a fault. Identifying the 
fault is the most important part of debugging; if it is not done correctly, 
then the rest of the debugging process is wasted. This, in fact, often 
happens. If you have approached testing systematically however, proving 
probable accuracy unit by unit, then any error is most likely to be in a new, 
hitherto untested, part of the program. A second person’s ideas can help 
here, just as with checking a design. 


When test data produces incorrect output, it is necessary to trace through 
the logic to determine how and where the program has failed. The values 
in memory variables can be examined by introducing additional print or 
display statements at selected points. A variation of the dry run method 
can then be used to examine the code and see how these values are 
being produced. However, statements added for debugging purposes
must be removed after the error has been corrected.
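

A short sketch of this technique follows, assuming Python. The routine `average` and the logger name `payroll_debug` are hypothetical. The example uses the standard `logging` module instead of plain print statements, so that the debugging output can be switched off by changing the logging level; the statements should still be removed once the fault has been corrected.

```python
import logging

logger = logging.getLogger("payroll_debug")   # hypothetical logger name


def average(values):
    """Return the mean of a non-empty list, logging intermediate values for debugging."""
    total = 0
    for index, value in enumerate(values):
        total += value
        logger.debug("index=%d value=%r total=%r", index, value, total)
    result = total / len(values)
    logger.debug("result=%r", result)
    return result


logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
print(average([2, 4, 9]))
```

Each debug line gives one row of a dry-run table, produced by the computer instead of by hand, which makes it easier to see where the actual values first depart from the expected ones.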


Stage 1 is to locate the error within the code. The following pointers may 
be helpful: 


• It is important to approach the code with a fresh and open mind, even
when looking at it for the nth time. Avoid making any assumptions.
The code must be seen from the viewpoint of an obedient and dumb
(highly literal) computer.


• Only the output from a program is relevant to analysing a failure. The
error in the output is a direct result of the failure, and however
impossible it may seem at first (or even second) glance, the fault must
exist at a related point in the code.


• In case of an elusive error, one must keep in mind that an error can
escalate and emerge at a different point from where it was caused. At
such times therefore, the total picture has to be re-examined.


• It sometimes helps to be able to visualise the problem, especially
when it concerns one of the input or output devices, or a file transfer.
While considering each statement, one should try to imagine exactly
what is happening in the computer.


Stage 2 is to construct a theory for the cause of the fault. All the evidence 
for the fault must be gathered and analysed in order to explain all that has 
been observed. The aim is to verify that what is thought to be at fault is
actually at fault.


Stage 3 is to devise a solution. If stages 1 and 2 have been carried out 
thoroughly, then the solution will probably be obvious by this time. 
However, the solution still has to be tested, otherwise new errors may be 
introduced whilst trying to correct the old ones. 


Use test data to dry run the proposed solution, both the standard test data 
and the data which caused the error to show (which should now become 
part of the standard test data for the unit). As far as is reasonably 
possible, ensure that the change has no side-effects. If you have any 
suspicions about the presence of side-effects, it will save time to check 
them now, rather than later. 


Stage 4 is carrying out the change to the software. This consists of three 
steps, but again a methodical approach is essential, otherwise more 
problems will occur later. The three steps are: 


• Entering the change: first ensure that the previous versions of
the source and object code are safe (in case anything goes wrong).
Then the source code can be edited, checked and compiled until it
is correct. Any time needed for testing should be scheduled at this
stage.


• Testing: the original data which caused the error to occur should
now work. Next, the standard unit tests should be run. These should
also work. If either test fails, then go back to stage 1 and rethink. If the
software is still in the unit-testing phase, then this type of testing is
enough. If, however, the testing has reached the integration stage or
beyond, then additional regression testing at the subsystem or system
level may be needed.


• Implementing the change: this should be accomplished using
standard in-house company procedures. These will vary, but it is
important to remember to update all the relevant documentation. This
is especially important for the test cases which have been added.
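

The testing step above can be sketched with Python's standard `unittest` module. The unit `percentage` and the fault it is supposed to have had (a failure when the second value is zero) are hypothetical, chosen only to show the data that exposed the fault being added to the standard test data.

```python
import unittest


def percentage(part, whole):
    """Hypothetical unit after correction: the original fault was a crash when whole == 0."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


class PercentageTests(unittest.TestCase):
    def test_standard_data(self):
        self.assertEqual(percentage(1, 4), 25.0)
        self.assertEqual(percentage(3, 3), 100.0)

    def test_data_that_exposed_the_fault(self):
        # Added to the standard test data once the fault was found.
        self.assertEqual(percentage(5, 0), 0.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PercentageTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests passed:", result.wasSuccessful())
```

Because both tests now live in the same suite, any later change that reintroduces the fault will be caught the next time the unit tests are run.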


Exercise 6.8 [45 minutes]


Imagine you are responsible for the maintenance of an existing system
and a problem occurs, e.g. when users enter certain data, an incorrect
value appears. Describe the process you may adopt in order to identify
the problem and resolve it successfully.


9 The Need for Robust and Reliable Software 


The main reason for developing robust and reliable software is to reduce 
the time spent on maintenance after the program has been installed. 


Software maintenance dominates the software lifecycle in terms of effort 
and cost. For a company, this activity is not profitable as it is taking 
valuable and scarce resources away from new development efforts. 
Additionally, the need to change programs and the difficulties of doing so 
are difficult to estimate — the activity can be very time consuming. 


It is also easier to test a program when it is robust and well-structured. 
This is because the testing process is carried out more rapidly and 
therefore more economically. It is important to understand that this phase 
in the software lifecycle of a program does not add value to it, as the 
developer is supposed to provide an error-free program. 


Finally, a software house needs to consider its public image and the effect 
of this on future business. Only by producing robust and reliable software 
will it be able to improve its company image. Robust and reliable software 
is a measure of organisational capability and maturity. 


The CMM (Capability Maturity Model) defines five levels of maturity
from level 1 (lowest) to level 5 (highest), in delivering quality software. The
majority of assessments place companies at level 1, predominantly 
because of inadequate software quality assurance processes, of which 
testing is a substantial part. 


Summary


In this chapter we have covered:


• Documentation of tests.

• Levels of testing.

• Desk checking and dry running.

• Diagnostic aids during compilation or run time.

• Typical facilities during interactive debugging.

• Program maintenance, including problem solving.


You should have a sound appreciation of the importance of planning, 
documenting and carrying out software testing efficiently, whether for 
structured or object-oriented programming. The principles are the same. 


Self Study 


These exercises and self study recommendations are designed to help 
you learn about testing. The exercises consist of: 


• Recommended reading.

• Internet research on key questions.

• Review questions.


Use them as follows: 
Work through the questions and jot down your initial answers. 


All the answers are contained in the chapter text. Go back and review the 
text to check the accuracy of your answers. Where an answer is not 
correct or incomplete, enter the correct answer against the question and 
use this for revision or for retesting at a later date. 


Self Study 1 [60 minutes]


What are the key points to grasp in order to understand testing
processes?


Why is it advisable to use both normal and abnormal scenarios 
when testing under controlled conditions? 


Why do organisations now regard testing as a serious issue? 
What are the two types of defect? 

Define validation and verification. 

Distinguish between the two. 

What is the difference between static and dynamic testing? 


Why must tests be fully described before they are carried out? 


Self Study 2 


How many levels of testing are there? 

Name them in order from the lowest level up. 

Explain what is involved at each level of testing. 

What do we hope to learn from integration testing? 

What are the three possible alternatives of combining modules? 
Which alternative is preferable and why? 

What is meant by walkthrough? 

Name the order in which modules may be combined. 

What is the objective of system testing? 


What other tests are carried out at system level? 


Why? 


Self Study 3

What is meant by desk checking? 
What is meant by dry run? 
How is desk checking carried out? 
Why is dry running time consuming? 
Is it worth doing? 


Why are programmers recommended not to check their own 
programs? 


What should variables be checked for? 


What is meant by single stepping through a program? 


Further Reading 


Any general textbook on software testing. 


Web Research 


Research object-oriented testing processes. Keywords: frameworks. 


1 Learning Outcomes


At the end of this chapter you will be able to:


• Describe alternative methods for designing software;

• Understand how object-oriented languages aid programming;

• Describe the use of database management systems and the
database query language;

• Use an application program generator;

• Describe the client/server solution.


2 Introduction 


Programmers, analysts and program language designers are striving to 
work towards ever greater efficiency and ease in the development of the 
coding needed to produce the complex, quality software we demand 
these days. The progression over time towards higher level, easy to 
understand programming languages is the result of this effort, as is the 
concept of code re-use, commonly quoted as one of the major benefits of 
object-oriented programming. 


This chapter looks at some of the new ways in which the IT industry is 
seeking to improve the speed and quality of software production. They 
include a range of fresh approaches to the software design process, a 
variety of software tools and concepts intended to make the process of 
creating code easier, and the development of tools to write the code for 
you. These all provide an alternative to manual programming. 


It is important to know about these advances as they may influence the 
decisions you make before you set about a programming task, especially 
a major task. 


One reason why these methods are emerging now is that object-oriented 
programming is reaching a critical mass in terms of practitioners, 
languages and applications, and there is now enough code available for 
programmers to consider how to improve upon its usage and how to 
utilise the productivity the technique has long promised. 


The emerging ways of organising the development processes include: 


• The concept of patterns, the idea that some system constructions are
repeatable from one development circumstance to another.


• The Unified Development Process, a software lifecycle model
associated with the UML.


• Re-factoring, the reworking of existing object-oriented code to
produce more efficient reusable objects.


• Design by contract, the idea that quality can be enhanced if software
objects are viewed as making contracts between one another; this
imposes obligations and adds rigour to the coding process, thereby
improving quality.
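

The design by contract idea in the list above can be made concrete with a small sketch. The Python example below is an assumption-laden illustration, not a standard library facility: `ContractViolation`, `require` and `Account` are hypothetical names. Preconditions state what the caller must guarantee, and the postcondition and invariant state what the object promises in return.

```python
class ContractViolation(Exception):
    """Raised when a caller or a routine breaks its side of the contract."""


def require(condition, message):
    """Check one clause of a contract."""
    if not condition:
        raise ContractViolation(message)


class Account:
    def __init__(self, balance):
        require(balance >= 0, "precondition: opening balance must not be negative")
        self.balance = balance

    def withdraw(self, amount):
        # Preconditions: obligations the caller must meet.
        require(amount > 0, "precondition: amount must be positive")
        require(amount <= self.balance, "precondition: insufficient funds")

        old_balance = self.balance
        self.balance -= amount

        # Postcondition: what this method promises in return.
        require(self.balance == old_balance - amount,
                "postcondition: balance not reduced correctly")
        # Invariant: must hold after every public method.
        require(self.balance >= 0, "invariant: balance must not be negative")
        return self.balance


account = Account(100)
print(account.withdraw(30))
try:
    account.withdraw(500)
except ContractViolation as error:
    print("rejected:", error)
```

Stating the obligations explicitly means that a broken contract is reported at the point where it is broken, not at some later point where the damage happens to surface.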


Aids to development include: 


• Class libraries, being able to access and use large quantities of
existing code instead of ‘reinventing the wheel’.


• Visual programming, the substitution of graphical styles of
programming (instead of text driven coding), which is perceived as
more intuitive and easier to learn.


• Java Beans, a particular instance of a form of software assembly
using software components.


Other tools include: 


• Application generators, which generate code on the basis of
information provided by users (these tools are now to be found in
many different guises and are a boon to non-programmers).


• Database management systems, which let programmers build on and
adapt common database functions, without needing to start with a
clean sheet.


3 Patterns 


This section provides a brief insight into software design patterns, a topic 
of rapidly growing interest in the object-oriented (OO) development 
community. 


Patterns describe ways of doing things — the ‘know-how’ as much as the 
form. The concept of software patterns emerged from the work of an 
architect, Christopher Alexander, who devised a language for encoding 
knowledge of the design and construction of buildings. The knowledge so 
captured is expressed as patterns of both recurring architectural 
arrangements and rules about how and when to apply such findings. 
Researchers in OO adopted the concept and explored how software 
frameworks can be recorded using software design patterns. 


Developers had long recognised that some aspects of software
construction were repeated in similar forms: not just objects, but
compilations of objects amounting to repeatable designs. From the early
1990s there was a movement to collect and describe these repeating
designs so that other people could read the patterns and then apply them.
This gave rise to the idea of the pattern book, a collection of such
designs; to date, the most famous pattern book to emerge is by the ‘Gang
of Four’ (Gamma, Helm, Johnson and Vlissides, 1995), which discusses
23 designs.


Patterns allow developers to communicate using well-known and 
accepted names for software interactions. Patterns facilitate the transfer 
of know-how, and not just software components, which helps in designing 
reusable software, a difficult and time consuming task. Common design
patterns can be improved over time, making them more robust than
ad hoc designs.
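

As an illustration of a well-known name standing for a whole design, the sketch below shows one of the 23 Gang of Four patterns, Observer, in Python. The class names `Subject`, `AuditLog` and `AlertCounter` and the stock-level scenario are hypothetical and chosen only for this example; the book itself should be consulted for the full description of the pattern.

```python
class Subject:
    """Keeps a list of observers and notifies each one when something happens."""

    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self, event):
        for observer in self._observers:
            observer.update(event)


class AuditLog:
    """One observer: records every event it is told about."""

    def __init__(self):
        self.entries = []

    def update(self, event):
        self.entries.append(event)


class AlertCounter:
    """Another observer: counts events without storing them."""

    def __init__(self):
        self.count = 0

    def update(self, event):
        self.count += 1


stock_level = Subject()
audit_log, alerts = AuditLog(), AlertCounter()
stock_level.attach(audit_log)
stock_level.attach(alerts)
stock_level.notify("stock below minimum")
print(audit_log.entries, alerts.count)
```

A developer who says "the stock level is observed by the audit log" conveys this whole arrangement to a colleague in one sentence, which is the communication benefit described above.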


Forms for pattern documentation vary, but in essence cover:


• the context in which the pattern can be applied;

• any prerequisites that should be satisfied before deciding to use a
pattern;

• a description of the program structure the pattern will define;

• a list of the participants needed to complete a pattern;

• the outcomes of using a pattern, both advantages and disadvantages;

• examples of the pattern’s use.


Design patterns:


• provide a common vocabulary;

• explicitly capture expert knowledge;

• improve developer communications with clients or colleagues;

• promote ease of maintenance;

• provide a structure for change.


They also have limitations:


• work is still needed on methodologies for defining and applying
patterns;

• they appear simpler to use than is the case and need careful selection
for a given situation (and this takes skill).


However, they appear to have considerable potential for the design and 
programming community. 


Exercise 7.1 [2 hours]


To expand your knowledge and understanding of patterns, undertake 
further research and reading. 


There are numerous books now available on the topic of software 
patterns, of which the best known is: 


Gamma, E., Helm, R., Johnson, R., and Vlissides, J. (1995). Design
Patterns: Elements of Reusable Object-Oriented Software.
Addison-Wesley, Reading, MA.


Access this website for the patterns community: 


http://hillside.net/patterns/. 


Here you will find example patterns, leads to other sources of patterns, 
tutorials on the subject and debate between parties on suitable 
definitions of the term ‘patterns’ etc. 


Browse this site thoroughly to obtain an overview of patterns. 


4 The Unified Process 


An in-depth explanation of the Unified Process (UP) is beyond the scope 
of this course; the intention is to present the key concepts in order to 
make you aware, as a programmer, of the potential of a particular way of 
working. The Unified Process is promoted as the lifecycle model to use 
with the Unified Modelling Language (UML), with which there is a strong 
association. Its main features are: 


• It is strongly iterative and incremental;


• It takes account of risk and addresses it from the start.


The Unified Process is based on iterations which either address different 
aspects of the design process or move the design forward in some way 
(this is the incremental aspect of the model). Prototypes are used to 
explore some aspect of the design, but it is not based on rapid 
prototyping. The whole ethos of UP is to break activities down into small 
progressive steps, rather than complete the design task in one go (as in 
the Waterfall model, for example). 

The approach is to: 

• plan;

• specify, design and implement;

• integrate, test and run;

• obtain feedback before the next iteration.


Risk can therefore be identified and managed professionally from the
project start. Where there is uncertainty about the software design or
complexity, it can be addressed by investigating, building and testing early
on, rather than later in a lifecycle. Once the outcome of the problem is
known, the process moves to the next iteration. This iterative and
incremental approach means that different groups can work on different
things at the same time.


The UP is linked to use cases because they help to: 


• identify the users of the system and their requirements;


• produce the definition of test cases and procedures;


• direct the planning of iterations;


• guide the creation of user documentation.


The UP actually covers the whole of the software development lifecycle, 
from business case to long term maintenance. There are four main 
phases: 


• Inception, in which the business case is developed and the project
scope defined.


• Elaboration, which is concerned with the functional requirements of
the proposed system.


• Construction, which involves completion of system analysis, design
and implementation, i.e. the building of the product.


• Transition, in which the system is deployed to the target users and
maintained. This phase also includes system conversions and user
training.


Within each phase there are five workflows:


• Requirements: identification of the functional and non-functional
requirements in the form of use cases.


• Analysis: iterative refinement of requirements towards system
requirements.


• Design: production of the detailed design for implementation.


• Implementation: coding, completion, packaging and documenting of
the product.


• Testing: testing the software for the full range of capabilities.


The five workflows occur in each of the four phases. The model breaks 
down further to cover activities within each workflow. Activities take inputs 
and produce outputs, known as artefacts. 


Exercise 7.2 [2 hours]


To gain a broader understanding of the Unified Process, access the
website of the Object Management Group http://www.omg.org/ or
Rational, the developers of the Unified Process concept
http://www.rational.com/.


Compare and contrast this model with other lifecycle models you know
e.g. Waterfall, Spiral, and consider the following questions:


In what respects is this process similar?

In what respects does UP differ?


Available literature on the Unified Process is currently limited but is
likely to increase. See what you can find from your academic
resources.


Your primary objective is to understand the relationship between the 
phases and workflows of the Unified Process and how iterations work 
within each phase. 


5 Re-factoring 


As the application of object technology, especially Java, has become 
commonplace, a new problem has arisen for the software development 
community. Significant numbers of inadequately designed and 
implemented programs have been created by less experienced 
developers, resulting in applications which are inefficient and hard to 
maintain and extend. Increasingly, software system professionals are 
discovering just how difficult it is to work with these inherited, non-optimal 
programs. 


Object-oriented languages are credited with promoting software reuse.
However, object-oriented software is usually not reusable when it is 
written the first time. Thus, object-oriented class libraries require frequent 
revisions before they evolve to stable, reusable libraries or frameworks. 


Re-factoring is the process of taking an object design and rearranging it to 
make the design more flexible and/or reusable. It is a technique to 
improve the quality of existing code, to make the design more efficient 
and easier to maintain. 


Definition: Re-factoring 


Re-factoring is a program transformation that reorganises a program
without changing its behaviour.


Re-factoring works by applying a series of small steps, each of which
changes the internal structure of the code while maintaining its external
behaviour. For example, if at any stage you become aware that an
operation does not fit properly in the class where it is (i.e. the
responsibilities are not allocated to the classes in the best possible
way), then a re-factoring step would be needed to make the necessary
changes to place the operation where it should be, updating whatever
code and design documentation needs to be modified.


A program may run correctly, but be poorly structured. Re-factoring 
improves its structure, making it easier to maintain and extend. 


If there are two classes with overlapping responsibilities and behaviour, 
you can factor out the common behaviour into a new super-class from 
which both inherit. In fact, this process can help with one of the hardest 
aspects of object-oriented analysis and design, namely, finding the right 
abstractions to make the design clean and robust. 


A common method of re-factoring is to re-factor along inheritance lines. 
For instance, suppose that in a design review you discover that two 
classes in your system which do not share a common super-class both 
implement very similar or identical behaviour. It would be advantageous to 
re-factor these two classes by moving the common behaviour into a 
shared super-class, and then changing the classes so that they descend 
from that class. You can also re-factor by simply moving methods from 
concrete subclasses up the hierarchy to more abstract super-classes, as 
you see the need. 
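

Re-factoring along inheritance lines can be sketched in Java as follows.
This is an illustration only: the class names (Account, SavingsAccount,
CurrentAccount) and the deposit behaviour are hypothetical and do not
come from any particular system. It assumes that both subclasses
originally held their own identical copies of the balance and of
deposit(); after re-factoring, the shared behaviour lives in one
super-class.

```java
// Hypothetical example of moving common behaviour into a shared super-class.
abstract class Account {
    private long balancePence;

    // Common behaviour, previously duplicated in both subclasses.
    void deposit(long amountPence) {
        if (amountPence <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balancePence += amountPence;
    }

    long getBalancePence() {
        return balancePence;
    }

    // Behaviour that genuinely differs stays in the subclasses.
    abstract String describe();
}

class SavingsAccount extends Account {
    @Override
    String describe() {
        return "Savings: " + getBalancePence();
    }
}

class CurrentAccount extends Account {
    @Override
    String describe() {
        return "Current: " + getBalancePence();
    }
}

class ExtractSuperclassDemo {
    public static void main(String[] args) {
        Account savings = new SavingsAccount();
        Account current = new CurrentAccount();
        savings.deposit(500);
        current.deposit(250);
        System.out.println(savings.describe());
        System.out.println(current.describe());
    }
}
```

The external behaviour of the two account classes is unchanged; only the
place where deposit() is defined has moved.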


You can also re-factor along composition lines. If you find that a class is 
implementing two different sets of responsibilities which do not interact 
much, or which use two subsets of the attributes of the original class, you 
may want to re-factor that class into two different classes, one of which 
perhaps contains the other. 
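

A minimal Java sketch of re-factoring along composition lines is given
here. The classes are hypothetical: it is assumed that a Customer class
originally held both the customer's name and all the address fields and
address formatting. The address responsibility has been split out into
its own class, which the Customer now contains.

```java
// Hypothetical example: the address responsibility extracted from Customer.
class PostalAddress {
    private final String street;
    private final String city;

    PostalAddress(String street, String city) {
        this.street = street;
        this.city = city;
    }

    // Formatting the address is now the job of this class alone.
    String format() {
        return street + ", " + city;
    }
}

class Customer {
    private final String name;
    private final PostalAddress address; // Customer contains an address

    Customer(String name, PostalAddress address) {
        this.name = name;
        this.address = address;
    }

    // Customer delegates address formatting instead of doing it itself.
    String mailingLabel() {
        return name + "\n" + address.format();
    }
}

class ExtractClassDemo {
    public static void main(String[] args) {
        Customer customer =
                new Customer("A. Smith", new PostalAddress("1 High St", "Leeds"));
        System.out.println(customer.mailingLabel());
    }
}
```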


It is advisable to re-factor in small steps to avoid introducing errors or 
becoming confused in the process. 


Therefore: 


• Find the smallest useful change you can make.
• Make it.
• Test the system.


This should save you a lot of debugging time, but make sure you have 
good tests in place before you begin re-factoring. 


Although this is a slow and time-consuming process which may cost more
in the short term, the longer-term costs are greatly reduced.


6 Design by Contract 


Design by Contract is a method designed to improve software quality. 
According to the Design by Contract theory, a software system is viewed 
as a set of communicating components whose interaction is based on 
precisely defined specifications of their mutual obligations, i.e. contracts. 


In reality, contracts exist between two parties when one of them (the 
supplier) carries out a task for the other (the client). Each party expects 
benefits and accepts obligations within the contract, and these are 
specified in the written document. The constraints, i.e. what each side is 
expected or not expected to do, are explicit. In other words, conditions are 
set and have to be met to ensure satisfactory implementation of the 
contract. 


The same ideas can be applied to software. The Design by Contract 
theory proposes associating a specification with every software element. 
These specifications (or contracts) govern the interaction of the element 
with the rest of the world. 


Design by Contract works via the assertion, a Boolean statement that 
should never be false and, therefore, will only be false because of a bug. 
Assertions take the form of: 


• Pre-conditions.
• Post-conditions.
• Invariants.


Pre- and post-conditions apply to operations.


A pre-condition describes something that must be true when an operation
is invoked.


A post-condition describes something that must be true when the
operation returns.


An invariant is an assertion about a class. The invariant is always true for 
all instances of the class, that is to say, whenever the object is available 
to have an operation invoked. 
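

The three kinds of assertion can be sketched in Java using explicit
run-time checks. This is only an illustration written for this section,
not Eiffel's built-in mechanism and not any particular Design by Contract
tool: the BoundedCounter class and its rules are hypothetical, and a
broken contract is reported here simply by throwing an exception.

```java
// Hypothetical class showing a pre-condition, a post-condition and an
// invariant as explicit checks.
class BoundedCounter {
    private final int max;
    private int value;

    BoundedCounter(int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be positive");
        }
        this.max = max;
        checkInvariant();
    }

    // Invariant: 0 <= value <= max whenever an operation may be invoked.
    private void checkInvariant() {
        if (value < 0 || value > max) {
            throw new IllegalStateException("invariant violated: " + value);
        }
    }

    void increment() {
        // Pre-condition: the client must not call increment() when full.
        if (value >= max) {
            throw new IllegalStateException("pre-condition failed: counter is full");
        }
        int old = value;
        value++;
        // Post-condition: the value has grown by exactly one.
        if (value != old + 1) {
            throw new IllegalStateException("post-condition failed");
        }
        checkInvariant();
    }

    int getValue() {
        return value;
    }
}

class ContractDemo {
    public static void main(String[] args) {
        BoundedCounter counter = new BoundedCounter(2);
        counter.increment();
        counter.increment();
        System.out.println(counter.getValue());
    }
}
```

The pre-condition is the client's obligation, while the post-condition
and the invariant are the supplier's. A failed check therefore points to
the party whose code contains the bug.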


Design by Contract provides a rigorous definition of an operation’s
purpose and a class’s legal state. Encoding these assertions into classes
aids debugging, but not all OO languages support assertions. Eiffel (the
language developed by Bertrand Meyer, see below) supports assertions as
part of the language itself; in other languages, mechanisms can be added
if required. OO languages such as C++ and Java have third-party Design
by Contract tools.


Exercise 7.3 [60 minutes]


The main proponent of Design by Contract is Bertrand Meyer: it was he
and his company who evolved the concept. Note that “Design by
Contract” is a trademark of Interactive Software Engineering. His work
is posted on the company website. Access the site (www.eiffel.com)
and read his statements on Design by Contract. In particular, consider
the examples used in the explanation. Make sure you understand fully
how the examples work: in software terms, in what way is the contract
fulfilled? What is the mechanism?


7 Class Libraries 


Definition: Program library (software library) 


A collection of programs and packages that are made available for 
common use within an environment; individual items need not be 
related. A typical library may contain compilers, utility programs, 
packages for mathematical operation etc. Usually it is only necessary 
to reference the library program to cause it to be automatically 
incorporated in a user’s program, e.g. DLL. 


Definition: Dynamic Link Library 


A file of procedures residing on disk which is available to an executing 
program so that relevant procedures can be read into memory and 
executed at run time. The advantage is that the executables are 
smaller, the link libraries can be shared, and, providing the interface 
remains unchanged, can be updated without recompiling the 
application. Although extra time is spent in disk input/output, disk 
caching and faster disk subsystems make this a valuable technique. 


The concept of a library has long been used in programming and, as
these definitions show, in different contexts. Class libraries refer
specifically to classes written by OO programmers and gathered together
for general (though not necessarily free) distribution.


A class defines its members, and classes may in turn be grouped into a
package, which may itself be part of a larger collection. Collections
can be readily searched to find classes, relationships, and
implementations.


Access to third-party code allows developers to be more productive
because they do not have to start programming from scratch. In popular
languages, such as Java, the volume of code continues to grow. For
example, in the first release of the Java 2 Enterprise Edition platform,
there was a total of 357 public classes and interfaces within 21 packages.
When v.1.3 was released, the total had grown to over 2,100 public classes
and interfaces within over 70 packages.
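

As a small illustration of the productivity gain, the sketch below sorts
and searches a list of names using only classes from the standard
java.util package. The demo class and the sample names are invented for
this example; the point is that no sorting or searching algorithm has to
be written by the application programmer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LibraryReuseDemo {
    // Returns a sorted copy of the names. The sorting algorithm itself
    // comes from the class library.
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> sorted =
                sortedCopy(Arrays.asList("Meyer", "Booch", "Jacobson"));
        System.out.println(sorted);
        // The library also supplies a search over the sorted list.
        System.out.println(Collections.binarySearch(sorted, "Jacobson"));
    }
}
```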


The disadvantage is that finding just the right code for your purposes from 
such a large selection can be a time-consuming task, to the extent that 
experienced programmers sometimes prefer to write the code rather than 
look for it — it is quicker for them. To help with the search and 
management tasks, directories of class libraries are being published 
alongside large scale visual maps (posters) showing classes and their 
relationships in diagrammatic form. 


These directories are specifically designed to make it possible to find 
information on any class or member quickly and easily. These reference 
books contain class descriptions which may include: 


• Class hierarchy diagrams showing connections to related classes.
• Detailed overviews describing purpose and key concepts.
• Member descriptions and member groupings.
• Examples of the classes in real-life contexts.
• Detailed descriptions and examples for each class member.


You will find numerous examples and code included in these books. 


Major software providers make software libraries available and these can 
be found on the Internet. 


As well as libraries for general programming languages, there are libraries
for specific types of application. One example is the graphics library
OpenGL, now the premier environment for developing portable, interactive
2D and 3D graphics applications.


OpenGL applications are to be found in computer-aided design, content
creation, energy, entertainment, game development, manufacturing,
medical, virtual reality and more. Originally a proprietary standard
developed by SGI (Silicon Graphics), the library was opened up in 1992
and is now the de facto industry standard for the production of
sophisticated graphics.


Exercise 7.4 [60 minutes]


You are a programmer in a team which is about to start a development
project. Your team leader has asked you to advise the team on using
class libraries to speed up the development process.


What consideration will you give to using class libraries as part of your 
approach? What do you see as the benefits? What do you think may 
be the disadvantages? 


What information do you think you need to know before you can advise 
the team? How will you find out more about the libraries available? 
What resources will you use to help you reach a decision? 


Write down what you think are the key aspects to consider about class 
libraries. 


Find examples of class libraries on the Internet or via software 
suppliers and discover what these libraries contain. 


8 Visual Programming


What is a Visual Programming Language (VPL)? In its simplest form, it
can be defined as a language which allows programming with visual
expressions (such as graphics, drawings, animation or icons). VPLs are
programming languages where programming is accomplished using
visual techniques to express relationships among, or transformations to,
data.


Such visual techniques include: 

• Sketching.
• Pointing.
• Demonstrating via direct manipulation.


Visual programming languages may be further classified according to the
type and extent of visual expression used, into:


• Icon-based languages.
• Form-based languages.
• Diagram languages.


Visual programming environments provide graphical or iconic elements
which can be manipulated by the user in an interactive way according to a
specific spatial grammar for program construction. A VPL must be able to
accomplish all its programming tasks in a visual manner, without resorting
to an alternate textual representation.


A VPL is not the same as a visual environment for programming. Visual
Basic and the entire Microsoft Visual™ family are not, despite their
names, visual programming languages. They are textual languages which
use a graphical user interface (GUI) builder to make programming user
interfaces easier for the programmer. The user interface portion of the
language is visual; the rest is not. The programmer must resort to the
underlying textual language to define new objects.


VPLs offer a number of advantages over traditional textual programming
languages. Most VPLs offer at least one, and often all, of the following
four advantages:


• Fewer programming concepts: the programmer is not required to
know about confusing programming concepts, e.g. scope, storage
allocation, pointers.


• Concreteness: the programming process is more concrete because
of the ability to directly manipulate objects. For example, a stack can
have a visual representation that shows its structure and data values.


• Explicit depiction of relationships: there may be some visual
representation that allows the user to see relationships between
different objects, e.g. a constraint or dataflow diagram.


• Immediate visual feedback: the user’s actions in a visual
programming environment have immediate consequences that are
reflected in a visual manner. For example, updating a data value
causes anything depending on that value to be re-evaluated. This has
obvious implications for both debugging and the explicitness of a
program.
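

The re-evaluation behind immediate feedback can be sketched in a textual
language. The Java classes below are hypothetical and are not taken from
any VPL product: an InputCell notifies its dependents whenever its value
is set, so a FormulaCell built on two inputs is always up to date, much
as a dependent display would be in a visual environment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

// Hypothetical input value that tells its dependents when it changes.
class InputCell {
    private int value;
    private final List<Runnable> dependents = new ArrayList<>();

    InputCell(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }

    void set(int newValue) {
        value = newValue;
        // Every dependent is re-evaluated as soon as the input changes.
        for (Runnable dependent : dependents) {
            dependent.run();
        }
    }

    void onChange(Runnable dependent) {
        dependents.add(dependent);
    }
}

// Hypothetical derived value computed from two inputs.
class FormulaCell {
    private int value;

    FormulaCell(InputCell a, InputCell b, IntBinaryOperator formula) {
        Runnable recompute = () -> value = formula.applyAsInt(a.get(), b.get());
        a.onChange(recompute);
        b.onChange(recompute);
        recompute.run(); // compute the initial value
    }

    int get() {
        return value;
    }
}

class DataflowDemo {
    public static void main(String[] args) {
        InputCell a = new InputCell(2);
        InputCell b = new InputCell(3);
        FormulaCell sum = new FormulaCell(a, b, (x, y) -> x + y);
        System.out.println(sum.get());
        a.set(10); // the dependent sum is recomputed automatically
        System.out.println(sum.get());
    }
}
```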


Note, though, that text still has its place in visual programming,
particularly for certain forms of documentation, e.g.:


• Naming, to distinguish between elements which are of the same kind.


• Expressing well-known and compact concepts which are inherently
textual, e.g. algebraic formulae.


This model can be implemented as: 


• parameter passing and function invocation, as in traditional textual
languages, or as


• message passing, as in object-oriented languages.


At first glance, most control structures in VPLs do not look anything like
the control structures of traditional textual programming languages.
However, they serve the same purpose: to provide a means of directing
the flow of control through a program, for example, through iterative or
parallel execution.


A range of visual programming languages is commercially available
today, covering a variety of sectors including general-purpose
programming, multimedia and computer-based training, data analysis and
visualisation, data acquisition, and design and testing.


Some well-known VPLs include LabVIEW from National Instruments,
mainly for data acquisition, Authorware from Macromedia for
computer-based learning and Internet development, and VisualAge from IBM.


Exercise 7.5 [60 minutes]


Research examples of visual programming languages, either through 
academic resources e.g. the library, or via the Internet. Much material 
is available in the form of books, manuals, websites, demonstrator 
versions and supplier literature to give you an idea of what visual 
programming is like. Note that there are different styles; you should 
compare and contrast the information you read. For example, 
Authorware uses a data flowline approach, but LabVIEW uses block 
diagrams and wires components together. Establish what these 
approaches mean, how they are similar and how they are different. 


Do you think there are advantages to visual programming? If so, what 
are they? Are there disadvantages? If so, what are they? 


Compare and contrast ‘pure’ visual programming languages with GUI
builders such as Visual Basic and Delphi. Do you agree there is a
difference in the approach and the way the languages are constructed?


9 Java Beans 


A Java bean is a component coded to a specific standard, which is 
designed to be used within an application or an applet. 


The concept of Java beans, together with the development environments
best suited to them, combines a number of the concepts described so far.
Beans are reusable chunks of code which can be manipulated by visual
application builder tools and assembled into applications, frequently
without the need to write a single line of code.


Beans are differentiated from typical Java classes by introspection. This
means that tools able to recognise predefined patterns in method
signatures and class definitions can ‘look inside’ a bean to determine its
properties and behaviour. A bean can then be manipulated at what is
referred to as design time (in contrast to run time), when it is being
assembled as a part within a larger application.


However, beans must follow a certain pattern known as the design
signature in order for introspection tools to recognise how beans can be
manipulated, both at design time and at run time. In effect, they publish
their attributes and behaviours through special method signature patterns
which are recognised by beans-aware application construction tools.
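

The following sketch shows how a tool might discover a bean's properties
from its method signatures, using the standard java.beans.Introspector
class. The GaugeBean class and its two properties are hypothetical. The
get/set naming pattern is what publishes the properties "label" and
"width"; nothing else has to be registered.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanIntrospectionDemo {

    // Hypothetical bean: a public class with a public no-argument
    // constructor and matching getX/setX method pairs.
    public static class GaugeBean {
        private String label = "";
        private int width;

        public GaugeBean() {
        }

        public String getLabel() {
            return label;
        }

        public void setLabel(String label) {
            this.label = label;
        }

        public int getWidth() {
            return width;
        }

        public void setWidth(int width) {
            this.width = width;
        }
    }

    // Lists the property names that introspection finds on a bean class.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            // Stopping at Object leaves out the inherited "class" property.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(GaugeBean.class));
    }
}
```

A builder tool would use the same information to populate a property
sheet for the bean at design time.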


Beans are tremendously useful, especially for programmers who consider 
themselves domain experts (financial analysts, scientists, linguists, bank 
loan officers, investment analysts, factory process experts), rather than 
systems programmers. 


Although Java beans will vary in functionality, most share certain common 
defining features: 


• Support for introspection, to allow a builder tool to analyse how a
bean works.


• Support for customisation, allowing a user to alter the appearance
and behaviour of a bean.


• Support for events, allowing beans to fire events and to inform
builder tools about both the events they can fire and the events they
can handle.


• Support for properties, which allow beans to be manipulated
programmatically as well as supporting the customisation mentioned
above.


• Support for persistence, so that beans which have been customised
in an application builder can have their state saved and restored.
Typically, persistence is used with an application builder's save and
load menu commands to restore any work that has gone into
constructing an application.
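

Three of these features (properties, events and persistence) can be
sketched together using the standard java.beans and java.io classes. The
ThermostatBean class is hypothetical, and Java object serialisation to an
in-memory byte array stands in here for whatever save and load mechanism
a real application builder would use.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BeanEventsDemo {

    // Hypothetical bean with one bound property, "target".
    public static class ThermostatBean implements Serializable {
        private static final long serialVersionUID = 1L;

        private int target = 20;
        private final PropertyChangeSupport changes =
                new PropertyChangeSupport(this);

        public ThermostatBean() {
        }

        public int getTarget() {
            return target;
        }

        public void setTarget(int newTarget) {
            int old = target;
            target = newTarget;
            // Fire an event so that registered listeners hear of the change.
            changes.firePropertyChange("target", old, newTarget);
        }

        public void addPropertyChangeListener(PropertyChangeListener listener) {
            changes.addPropertyChangeListener(listener);
        }

        public void removePropertyChangeListener(PropertyChangeListener listener) {
            changes.removePropertyChangeListener(listener);
        }
    }

    // Persistence: save the bean's state to bytes and restore a copy.
    static ThermostatBean saveAndRestore(ThermostatBean bean) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bean);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ThermostatBean) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ThermostatBean bean = new ThermostatBean();
        bean.addPropertyChangeListener(event -> System.out.println(
                event.getPropertyName() + " changed to " + event.getNewValue()));
        bean.setTarget(25);
        System.out.println(saveAndRestore(bean).getTarget());
    }
}
```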


Java beans were originally designed for user-interface and client-side 
tasks, but with the release of the Enterprise Java Beans specification, 
there are now server-side features available. 


10 Application Program Generators 


An application program generator is a software system which produces a
computer program in response to a user’s needs. The system comprises
a set of pre-coded modules which perform different functions. Users
specify what they require; the application generator determines how to
perform the tasks and produces the instructions for the program. The
language therefore works at a higher level than normal high-level
languages such as COBOL, Pascal or C, and an application program
generator is classified as a fourth-generation language (4GL).


4GLs are non-procedural languages, i.e. they allow users to specify what
a program must do, rather than how to do it. 4GLs are intended for
interactive, online operation; commands and messages are in simple
English-like sentences and many offer a facility for menu-driven operation.
Thus, they may be called user-friendly.
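

The difference between saying what is wanted and saying how to obtain it
can be illustrated by analogy in Java. Java is a third generation
language, so this is only an analogy and not an example of a 4GL; the
class and method names are invented for the illustration. The first
method spells out the loop, the index and the test, whereas the second
states the condition and leaves the iteration to the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class WhatVersusHowDemo {

    // "How": the programmer controls the loop, the index and the test.
    static List<String> startingWithProcedural(List<String> names, char initial) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            String name = names.get(i);
            if (!name.isEmpty() && name.charAt(0) == initial) {
                result.add(name);
            }
        }
        return result;
    }

    // "What": the request states the condition only.
    static List<String> startingWithDeclarative(List<String> names, char initial) {
        return names.stream()
                .filter(name -> !name.isEmpty() && name.charAt(0) == initial)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("Leeds", "York", "London");
        System.out.println(startingWithProcedural(cities, 'L'));
        System.out.println(startingWithDeclarative(cities, 'L'));
    }
}
```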


A 4GL is not straightforward to define because there are a number of
different types and the producers of such systems define each one 
differently. Fourth generation languages differ widely in that they tend to 
be developed for different specific domains. Thus some will be aimed at 
database applications, some will have been designed for use with 
Graphical User Interfaces, while others focus completely on processing 
data and generating reports. 


Some are designed to be used by a programming specialist who uses the 
general set of software tools to build a particular application system, while 
others are designed to be used by the end-user, i.e. a person who has 
little programming experience. 


Recent developments in application generators for non-programmers are 
in the area of user interface applications, particularly for hypertext 
applications to access information on the World Wide Web. Thus the 
term application generator is also used for simpler software products 
which provide a flexible means of tailoring a general software package to 
deal with particular situations. This is achieved by defining parameters to 
be used by the application generator. Perhaps a better term for these 
would be ‘end-user application generators’, but this is not a definition. 


It is probably easier to examine the characteristics of a fourth generation 
language before attempting a definition. Essentially, they are easier to use 
than third generation languages, both for programmers and end-users. 
They can be summarised as having the following characteristics: 

• Used in a particular domain.
• Online operation.
• User-friendly.
• Very high-level.
• Non-procedural code, i.e. the user specifies what to do rather than
how to do it.


Features of a 4GL 


A fourth generation language will provide ways for the user (the person or
programmer who is building the application) to do such things as:


• define the data in terms of input and validation and the processing
which must be performed on the data;


• define the required output in terms of the format of data and the layout
of the printed report or screen;


• define the processing required, which can include the user’s
interaction with screen information, e.g. screen-based forms in a
database, user queries or navigation through a multimedia
application;


• select combinations of standard processing operations.


Definition: fourth generation language (4GL) 


A term used by the data processing community for a high-level
language which is designed to allow users who are not trained
programmers to develop applications, in particular for querying
databases and generating reports. 4GLs are usually non-procedural
languages in which the user describes what is wanted in terms of the
application, not the computer. The processor takes the user’s
description and either interprets it directly or generates a program (in a
database query language or COBOL) which will perform the desired
operation. For this reason, the latter are sometimes called application
generators.


Development of Application Generators


There are two main areas in which development has taken place:


• The domains in which the created application will work.


• The type of user of the application program generator: initially
programmers and, more recently, end-users who have little or no
programming experience.


Report generators are an example of a very early application generator, 
although they are now considered to be a fourth generation tool, as are 
query languages. Report generators were designed to read and process 
files and produce reports with the facility to provide totals and sub-totals. 
Report Program Generators (RPGs) are now used to create complete 
systems, not just reports, but in the 1960s they were used by 
programmers as an alternative and quick way to produce reports. 


The report generation tools now available allow users (not necessarily 
programmers) to specify what should be in the report and how it should 
look. Some are similar to the simple extract and print — others are much 
more powerful, and therefore more flexible. The more powerful report 
generators are effectively very high-level languages which are easy to 
learn and easy to use correctly, because they are formalised and 
restricted in the way that the commands can be used. 


As program development tools evolved and data processing changed,
packages became available which allowed a user to define a file
interactively, and which would then handle the standard requirements of
file creation, file manipulation, file interrogation, and file amendment.
These were combined with packages which accepted a report format from
the user and used the created file to produce a printed report. One
example of this is a package to create a relational database with a query
language incorporated. These are addressed in the next section. To allow
the user to define more complex tasks using the files created, very
high-level commands were incorporated into these packages. The package
then became capable of handling most applications, without the need to
understand complex programming techniques.


Some database software contains wizards to help the non-programmer 
use the package. The following screen shots have been taken when using 
Microsoft Access. The use of the report wizard will illustrate, at a very 
simple level, the difference between the user interaction and user input 
which generates the code and the effort involved in writing the program. 
The report wizard provides the facilities for a user to choose from a menu 
and select options by clicking. Some of the choices offered are limited, 
e.g. only three standard report layouts are provided. 


The example chosen to illustrate the concept of a fourth generation 
language is a file of names and addresses. Having selected a file of data 
(referred to as a table in the database management software), the user is 
presented with all the fields and asked to select the fields which are to 
appear on the report in the order in which they are to appear. Figure 7.1 
shows that the fields which have been selected have been moved from 
the left hand side to the right hand side. The user is able to rectify 
mistakes or change his/her mind in the process. 


[Screenshot: the Report Wizard dialogue "Which fields do you want on your
report?", with Table: Addresses chosen under Tables/Queries. Fields such
as AddressID, AddressName, StateOrProvince, EmailAddress, HomePhone,
WorkPhone and WorkExtension are divided between the Available Fields
and Selected Fields lists.]


Figure 7.1 Choosing fields to appear in the report


The next stage is to decide if any of the fields need to be grouped. For
example, we could have chosen to group by city; this means that the
report would show each city as a side heading, with all the addresses for
that city appearing below the appropriate side heading. There is also an
option to provide totals. For example, assume a customer orders table
contains fields such as product code, item code, number ordered and unit
price. The report of the orders may be required to be grouped by product
code, so that all the items ordered of a particular type of product would
be listed under the side heading of the product code. Totals of the number
ordered may also be required. The report wizard provides summary
options for the user to choose and these include the facility to provide
totals at the end of each group. Grouping was not chosen in this example.
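

To give an idea of the work the wizard saves, the sketch below shows one
way a programmer might produce the group totals by hand in Java. It is
not the code which Access generates: the OrderLine class, its field names
and the sample data are assumptions based on the orders table described
above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical row of the customer orders table.
class OrderLine {
    private final String productCode;
    private final String itemCode;
    private final int numberOrdered;

    OrderLine(String productCode, String itemCode, int numberOrdered) {
        this.productCode = productCode;
        this.itemCode = itemCode;
        this.numberOrdered = numberOrdered;
    }

    String getProductCode() {
        return productCode;
    }

    String getItemCode() {
        return itemCode;
    }

    int getNumberOrdered() {
        return numberOrdered;
    }
}

class GroupedReportDemo {

    // Groups the order lines by product code and totals the number
    // ordered within each group. A TreeMap keeps the groups in
    // product-code order.
    static Map<String, Integer> totalsByProduct(List<OrderLine> lines) {
        return lines.stream().collect(Collectors.groupingBy(
                OrderLine::getProductCode,
                TreeMap::new,
                Collectors.summingInt(OrderLine::getNumberOrdered)));
    }

    public static void main(String[] args) {
        List<OrderLine> lines = Arrays.asList(
                new OrderLine("P1", "A", 3),
                new OrderLine("P2", "B", 5),
                new OrderLine("P1", "C", 4));
        totalsByProduct(lines).forEach((product, total) ->
                System.out.println(product + " total ordered: " + total));
    }
}
```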


The next stage is to decide how the information in the report will be 
sorted. For example, the names and addresses may be required in 
LastName order or City order. Figure 7.2 shows the dialogue window with 
the choice of LastName within City. The choice of fields will appear in a 
drop down menu when the triangle on the right hand side of the data entry 
box is clicked. 
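

Sorting by LastName within City means that City is the first sort key and
LastName the second. Written by hand in Java, this might look like the
sketch below. The AddressEntry class and the sample data are hypothetical
stand-ins for rows of the Addresses table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical row of the Addresses table.
class AddressEntry {
    private final String lastName;
    private final String city;

    AddressEntry(String lastName, String city) {
        this.lastName = lastName;
        this.city = city;
    }

    String getLastName() {
        return lastName;
    }

    String getCity() {
        return city;
    }
}

class SortReportDemo {

    // Returns a copy sorted by City, then by LastName within each city,
    // both in ascending order.
    static List<AddressEntry> sortForReport(List<AddressEntry> entries) {
        List<AddressEntry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing(AddressEntry::getCity)
                .thenComparing(AddressEntry::getLastName));
        return sorted;
    }

    public static void main(String[] args) {
        List<AddressEntry> entries = Arrays.asList(
                new AddressEntry("Smith", "Leeds"),
                new AddressEntry("Jones", "York"),
                new AddressEntry("Adams", "Leeds"));
        for (AddressEntry entry : sortForReport(entries)) {
            System.out.println(entry.getCity() + " " + entry.getLastName());
        }
    }
}
```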


[Screenshot: the Report Wizard dialogue "What sort order do you want for
your records?" (records can be sorted by up to four fields, in either
ascending or descending order). City is chosen as the first sort field and
LastName as the second; the open drop-down list offers (None), FirstName,
LastName and PostalCode.]


Figure 7.2 Providing sort information to the report wizard


The report layout is the next step. This is shown in Figure 7.3. You will
see that only the standard layouts are on offer. When the different layouts
are chosen, the diagram of the layout will change in the left hand box.
Again, this is an example of user-friendliness.


[Screenshot: the Report Wizard dialogue "How would you like to lay out
your report?", offering Layout options (including Columnar and Justified)
and Orientation options (Portrait and Landscape), a preview diagram of the
chosen layout, and a checkbox "Adjust the field width so all fields fit on
a page".]


Figure 7.3 Choosing the report layout in the report wizard


The style of the report is now chosen. This involves, for example, the use
of different fonts and sizes for the letters, and choosing the colour of the
lines and letters. The dialogue box is shown in Figure 7.4.


[Screenshot: the Report Wizard dialogue "What style would you like?", with
a preview showing sample Title, Label and Control text in the chosen
style.]


Figure 7.4 Choosing the style of the report in the report wizard.


The report is shown in preview for the user to accept or change. The user
who created this report has not had to write any SQL code. The user
communicated by interacting with the wizard, making choices from lists
and menus provided by the wizard. Thus the language used is much
closer to English than that used in third generation programming
languages such as COBOL and C.


To summarise, a report generator would provide: 


• multi-file input;
• calculations on fields;
• control of printing;
• sub-totalling and layout;
• invocation of library routines.


End users, as well as programmers, can often make effective use of this 
type of generator, which then releases programming effort for the more 
complex programming jobs. For example, the only commands which a 
generator needs are parameters to define the relevant aspects of the 
processing — e.g. which file to use, which records, which fields, which type 
of report, what style, what order etc. 


Advantages of using a report generator are: 


• It will save time, enabling the straightforward jobs to be completed
quickly and thus leaving more time for the complicated ones.


• Report generator programs are very easy to modify, both by
programmers and end-users. They are clear, logical, readable and
hence easy to maintain.


However, no report generator can be as powerful or as comprehensive as
a very well developed language like C++ or COBOL. For the routine
extraction of data, and the formatting of data files or reports from it, a
report generator may nonetheless be perfectly acceptable.


Some microcomputer software can also be used to create specialised 
applications — in other words, to create new software. Microcomputer 
software packages which fall into this category include many spreadsheet 
programs. Senior managers can bring data and information together from 
different sources and manipulate it in new ways: 


• to make projections;


• to conduct what-if analyses;


• to make long-term planning decisions.


In a business without computers, tasks such as identifying overdue
accounts receivable, calculating how long they have been overdue and
the penalty to be incurred, can take hours of work. However, with an
electronic spreadsheet package, the user can create an application in less
than half an hour, which will calculate accounts receivable automatically,
and the application can be used many times. Spreadsheets also contain
macros, which allow the user to program to a limited extent.
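

For comparison, the calculation which such a spreadsheet application
performs might be written in a programming language as follows. This is
an illustrative sketch only: the penalty rule (1% of the amount for each
complete 30 days overdue) and all the names are assumptions made for the
example, since the text does not specify a rule.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class OverdueAccountsDemo {

    // Number of days an account is overdue; zero if it is not yet due.
    static long daysOverdue(LocalDate dueDate, LocalDate today) {
        return Math.max(0, ChronoUnit.DAYS.between(dueDate, today));
    }

    // Assumed rule for illustration: 1% of the amount for each complete
    // 30 days overdue. Amounts are held in pence to avoid rounding errors.
    static long penaltyPence(long amountPence, long daysOverdue) {
        long completePeriods = daysOverdue / 30;
        return amountPence * completePeriods / 100;
    }

    public static void main(String[] args) {
        LocalDate due = LocalDate.of(2024, 1, 1);
        LocalDate today = LocalDate.of(2024, 3, 5);
        long days = daysOverdue(due, today);
        System.out.println("Days overdue: " + days);
        System.out.println("Penalty (pence): " + penaltyPence(10000, days));
    }
}
```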


Application Generators 


These are different from the above in that both report generators and
query languages only allow the user to specify output and input-related
tasks, whereas application generators allow users to specify a complete
software application (in fact, a program) with the usual format of input,
validation of data, processing (logical, computational and functional) and
output (usually in the form of reports or screen displays). Their advantage
is that they allow users to reduce the time it takes to produce a working
system.


Application generators accept the specification for a program in a 
computer-usable form. This specification file is read into the generator 
itself which determines how to perform the tasks and produces the 
instructions for the computer. As with query languages and report 
generators, the user of an application generator does not need to specify 
exactly how processing tasks are to be performed. Examples of this are 
authoring software and software to produce websites. 


Definition: application generators 


A program, a software tool, which is capable of creating a range of 
application programs in a particular domain. The generated program 
will be configured by information provided by the person using the 
application generator. Domains in which application generators are 
frequently encountered include simulation, process control and user 
interface software. See also fourth generation language. 


Authoring software is used to create multimedia applications, which are
typically provided on CD-ROM. Initially it was used only to create
prototypes; the actual applications would then be developed using a
programming language which could handle graphic, video and sound files.
Further development has resulted in authoring software being used to
create the finished applications. Some authoring software can be
combined with a programming language (e.g. Authorware and C++) or has a
programming language integrated (e.g. Macromedia Director and Lingo, an
object-oriented programming language). Both packages can be used by
end-users and by programmers (who would be able to use the programming
language).


A further development has been in the area of applications for the World
Wide Web. Initially, websites were developed using HTML (Hypertext
Markup Language) which, strictly speaking, is not a programming
language. As sites became more interactive, e.g. icons changing as a
user rolled the mouse over them, JavaScript was needed in addition to
HTML. When websites started being linked to databases, a programming
language such as Java was also needed.


Software became available to present the web page to the user in a
WYSIWYG (What You See Is What You Get) format; the user would enter
words and graphic images just as in a word processing package, and the
software generated the HTML code.


As the developments outlined above have occurred, these software
packages have been upgraded to allow the user to add behaviours (things
happening), with the necessary JavaScript coding inserted in addition to
the HTML. This type of software ranges in complexity and caters for the
end-user, multimedia developer or programmer. Two examples are
Microsoft FrontPage and Macromedia Dreamweaver.


Definition: user-friendly


A qualitative term applied to interactive systems (hardware plus
software) which are designed to make the user’s task as easy as
possible by providing feedback. Ways in which a system can be made
user-friendly include:

• list of valid commands available on request;

• use of a graphical user interface;

• ability to undo actions made in error or by accident;

• use of graphics and colour to indicate activity;

• availability of a help system giving information appropriate to the
current situation;

• choice of interaction methods to suit personal preference and level
of expertise;

• immediate verification of data input, such as checking that a
number is in the correct range or by word-by-word spell checking.
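
The last item in the list above, immediate verification of data input, can be sketched as a small Python routine. The function name `read_number` and the wording of the messages are assumptions made for illustration; the point is that the user is told at once what was wrong and what is expected.

```python
def read_number(text, low, high):
    """Check one piece of typed input and return (value, message)."""
    try:
        value = int(text)
    except ValueError:
        return None, f"'{text}' is not a whole number. Please enter digits only."
    if not low <= value <= high:
        return None, f"{value} is out of range. Please enter a number from {low} to {high}."
    return value, "OK"


print(read_number("42", 1, 100))
print(read_number("abc", 1, 100))
print(read_number("150", 1, 100))
```

In a complete program this check would sit inside a loop that repeats the prompt until a valid value is entered.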


Inadequacies of the Fourth Generation 


During the period 1975-1990, Fourth Generation products made 
considerable progress in addressing some of the weaknesses of previous 
generations, and have come to be used for a considerable proportion of 
application software development (e.g. Sumner & Benson, 1988). Despite 
their promise, however, (although in part because of it), '4GLs' have been 
subjected to widespread criticism. The following table indicates the 
reasons.


Development            | Execution                       | Prototyping | Development Time       | Execution Time
                       |                                 |             | Response  | Machine    | Response  | Machine
                       |                                 |             | Speed     | Efficiency | Speed     | Efficiency
Program Generator      | Executable Code                 | Very slow   | Very slow | Very low   | Very high | Very high
Program Generator      | p-Code and Run-time Interpreter | Very slow   | Very slow | Very low   | High      | High
Application Generator  | Executable Code                 | Slow        | Slow      | Low        | Very high | Very high
Application Generator  | p-Code and Run-time Interpreter | Slow        | Slow      | Low        | High      | High
Application Generator  | Tables and Run-time Table       | Slow        | Very high | Very high  | High      | Low
Full Interpretive      | Fully Interpretive              | Very fast   | Very high | Very low   | Low       | Very low


Exercise 7.6 [20 minutes]


Refer to the definition of user-friendly. This could be applied to
programs you are writing. What additional operations would you
include if you were writing a program to accept user input and were
asked to make the program user-friendly?


Exercise 7.7 [20 minutes]


What is the difference between 4GL tools such as a report generator
and query language and an application generator?


11 DBMS and the Database Query Language


11.1 Introduction 


This section is concerned with both the use of a Database Management 
System (DBMS) to create databases, and the programming language 
used to access and update databases, the Database Query language. 
You will not be expected to write programs in query language, merely to 
understand the underlying concepts of a database and how query 
languages differ from the procedural type of programming covered so far. 
Query languages were mentioned in the section concerned with 
application generators and they were identified as a fourth generation 
tool. Background is provided by looking at the information requirements 
of an organisation and then identifying how useful databases are in 
addressing this. 


11.2 Organisation of Data 


In any organisation, different parts of the same data are required by 
different departments. For example, both the personnel department and the
finance department require information about employees. Some of this
data, e.g. employee number, employee name and address, would be
required by both departments. However, the finance department is not
concerned with the employee’s post and grade or review information and 
the personnel department is not concerned with data such as year to date 
gross pay and tax paid to date. If each department creates files for its own 
use, multiple copies of some of the data will be produced. This is called 
data redundancy. 


If an employee changed address, then both departments would need to 
be notified and both files would need to be updated. There is thus the 
danger that the information held by the organisation is inaccurate and 
inconsistent, i.e. different. In this case, the data could be said to lack data 
integrity. 
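
The problem can be made concrete with a short Python sketch. The two dictionaries stand in for the separate departmental files; the employee number, name and addresses are invented for the example.

```python
# Each department keeps its own copy of the employee's address (data redundancy).
# All values below are invented for illustration.
personnel_file = {201: {"name": "A. Example", "address": "1 Old Road"}}
finance_file = {201: {"name": "A. Example", "address": "1 Old Road"}}

# Only the personnel department is told about the change of address.
personnel_file[201]["address"] = "3 Mill Lane"


def inconsistent_employees(file_a, file_b):
    """Return the employee numbers whose address differs between the two files."""
    return [
        number
        for number in file_a
        if number in file_b and file_a[number]["address"] != file_b[number]["address"]
    ]


mismatches = inconsistent_employees(personnel_file, finance_file)
print(mismatches)
```

The organisation now holds two different addresses for the same employee, and nothing in either file says which one is correct.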


If a department wishes to modify its software system to enhance or 
update its facilities, the data files often need to be modified too. For 
example, if a review system is to be implemented, the personnel 
department may want to produce information in the form of reports. In 
order to do this they would have to store data concerning the review in the 
file. This would mean that the record layout will need to be changed to 
add the extra fields and every program which accesses this file will need 
to be changed, as programs are data dependent. 


Another problem arises if the personnel department is required to print
a report for management to provide information concerning possible
redundancies. In addition to the employee’s personal details, such as
age and start date, management also requires details of job
description, grade and salary. But the salary data is not held on the
personnel file. There are two possible solutions.


• The first involves printing two reports of all employees in employee
name order, one from the finance department showing name and
salary and one from the personnel department showing the rest of the
required information. The manager would then have to look at two
reports at the same time.

• The other solution would be to write a one-off program which
accessed both files and printed the report. If the files were physically
held in different departments, then copies of the files would be
needed and stored in one place so the program could access them.


These are both extremely clumsy solutions to the problem. The file 
handling described is an example of a File Management system. 


File Management Systems (FMS) 


Computers were first used commercially in 1954, when the General 
Electric Company purchased a UNIVAC (Universal Automatic Computer) 
for its research division. At first, the processing performed was 
straightforward. Data was usually organised sequentially and stored in a
single file on magnetic tape, which contained all the elements of data 
required for processing. 


The term file management system was introduced to describe this 
traditional approach to managing business data and information.
However, file management systems did not provide users with an easy 
way to group records within a file, or to establish relationships among the 
records in different files. 


As disk storage became cheaper and its capacity grew, new software 
applications were developed to access disk-based files. The need to 
access data stored in more than one file was quickly recognised and 
prompted more complex programming requirements. 


To deal with these problems and the ever-growing demands for a flexible, 
easy-to-use mechanism for managing data, the concept of a database 
was developed. 


Definition: file management system 


A software system which provides facilities for file management (often 
specifically of data files) at a level above that offered by operating 
systems (but in the case of data files, below that offered by database 
management systems).


Limitations of File Management Systems


• Lack of data integration leading to data redundancy: the same data
elements appear in many different files and often in different formats.
This makes updating files difficult, time consuming and error prone.

• Lack of data integrity: in addition to wasting space, data redundancy
creates a problem when it comes to file updating. When an element of
data needs to be changed, for example an employee address, it
must be updated in all the files, which is a tedious procedure. If some
files are missed, or not updated at the same time, data will be
inconsistent (that is, data integrity is not maintained) and reports will
be produced with incorrect information.

• Data dependence of programs: if the data structure is changed, then
programs will need to be changed (or modified). If programs are
changed, then often the data structure will need to be modified. (Note
the example of personnel wanting reports concerning review
performance.)


The introduction of databases solved these problems. 


The Introduction of Databases 


Definition: Database


A database is a single, organised collection of structured data, stored
with a minimum of duplication of data items so as to provide a
consistent and controlled pool of data. The data is common to all users
of the system, but is independent of the programs which use the data.


The software used to create and use databases is called a Database 
Management System (DBMS). A DBMS is a collection of different 
programs which perform specific tasks. These are explained in detail 
later, in section 11.5. Examples of DBMS are: 


• Oracle for a UNIX environment.

• Access for the microcomputer environment.


These are often mistakenly referred to as databases. The strict definition 
of a database is the database application, which is built using the DBMS 
software. 


The introduction of databases has involved some changes in the structure 
of large businesses and the way in which people work. In small 
businesses, databases may be both created and operated by the user. 
However, in larger businesses, the corporate database is usually created 
by technical information specialists such as a database administrator, but 
the software, the DBMS, is acquired by the information systems
department. 


To design a database, the organisation must describe the information it 
needs to database designers. Users participate greatly in the process of 
defining the information needed to be stored in the database. The DBMS 
is used to create the database and then users will generate and extract 
the data stored by the DBMS. 


In order for users from different departments to have access to the data in 
the database, a network is used. This is addressed in the section on
client/server computing. Thus employees in a department have access to a
centrally held pool of data which they no longer own and for which they 
are not responsible. 


The responsibility and ownership of the data acquired by the organisation 
has been taken on by the organisation and addresses the needs of the 
whole organisation. This has resulted in the introduction of new roles, in 
particular the database administrator, mentioned earlier. A job description 
can be found in the self study section at the end of this chapter. It has 
also resulted in changes to job descriptions. 


Managers need information to make effective decisions. The more 
accurate, relevant, and timely the information, the better informed 
management will be when making decisions. If a manager has access to 
the database from a computer, then he/she can look up information 
without having to request a clerk to produce it. 


Since the early 1980s, tremendous advances have been made in 
developing DBMSs for microcomputers. They are now easy enough for 
users to learn to operate without assistance and powerful enough to 
produce valuable management information. Regardless of the size of a 
business, the capabilities that a DBMS can provide are invaluable. It is 
one of the most powerful tools available for use as an information 
resource. 


Definition: data independence


The separation of data from the programs which use the data. Nearly
all modern applications are based on the principle of data
independence. The whole concept of a DBMS supports the notion of
data independence, as it represents a system for managing data
separately from the programs using the data.


Providing data independence has been a main motivation for the
development of database management software. Data independence is a
relative term, and different products provide different levels of it. It
is particularly important for large, shared databases which are required
to evolve in line with user needs. The provision of data independence
frequently conflicts with the need for efficient (fast) processing and
usually necessitates some compromise in terms of the software techniques
used.


Logical data independence refers to the facility to change the logical 
schema and thus evolve the content of the database; physical data 
independence refers to the facility to change the storage schema and thus 
modify and improve performance. 


Database Models 


There are three models for organising data in a database. 


• Hierarchical.

• Network.

• Relational.


These three models have developed since the late 1960s, but the 
relational model is now used the most extensively and is the one referred 
to in this text. Details of the hierarchical and network models can be 
found in the self study section at the end of the chapter. 


Exercise 7.8 [20 minutes] 


What are the limitations of a File Management System? 


Relational Databases 


A relational database is made up of many tables (known as relations) in
which related data items are stored. Each relation can be considered
conceptually as a file and represents an entity (something about which
data is held, such as an employee). It has a number of rows (similar to
records in a file) and columns (similar in concept to fields). Rows are
called tuples and columns are called attributes.


To aid understanding of the meaning of these terms, conceptually they 
can be linked to terms used in traditional file processing in the following 
way: 


• Table = relation = file.

• Tuples = records.

• Attributes = fields.


Definition: relational model


A data model which views information in a database as a collection of
distinctly named tables. Each table has a specified set of named
columns, with each column name (also called an attribute) being
distinct within a particular table, but not necessarily between tables.
The entries within a particular column of a table must be atomic (that
is, single data items) and all of the same type. The logical records
held in a relational database are viewed as rows in these tables. Each
logical record is thus constrained to contain only a set of elementary
data items of a pre-specified type.


Personnel Table (Relation)


Employee Number | Name    | Address    | Telephone | Position
                | B. Hall | 12 Hill St | 56723     | trainee


Salary Table


Employee Number | Salary | Pay YTD | Deductions YTD
106             | 20,000 | 12,000  | 3,000


Figure 7.5 Relational database example


Figure 7.5 shows two tables from a company database. Name, Employee 
Number, Address etc. are attributes of the relation Personnel Table, and 
the row with 106, 20,000 etc. is one tuple of the relation Salary Table. 
Notice that each table contains the employee number. This is the 
attribute which creates the link or relationship between the rows in the 
different tables. 
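
The link through the employee number can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the notation of any particular DBMS discussed in this text: the table and column names are chosen for the example, and pairing employee number 106 with B. Hall is an assumption made so that the two rows can be joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE personnel (employee_number INTEGER PRIMARY KEY, name TEXT, position TEXT);
    CREATE TABLE salary (employee_number INTEGER PRIMARY KEY, salary INTEGER);
    -- Assumed pairing of 106 with B. Hall, for illustration only.
    INSERT INTO personnel VALUES (106, 'B. Hall', 'trainee');
    INSERT INTO salary VALUES (106, 20000);
""")

# The shared attribute employee_number relates a row in one table to a row in the other.
rows = conn.execute("""
    SELECT p.name, p.position, s.salary
    FROM personnel AS p
    JOIN salary AS s ON s.employee_number = p.employee_number
""").fetchall()
print(rows)
```

Neither table holds both the position and the salary, yet a single request returns them together because the rows share an employee number.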


In a relational database, complex logical relationships between records 
can be expressed reasonably easily. Multiple files may be easily cross- 
referenced and data accessed as if it were on a single file. Quick data 
summaries may be produced. A report generator package forms an 
integral part of the DBMS: a report format may be defined, a restricted 
view of data may be selected, and a report produced without having to 
define the procedure to obtain it step-by-step.


The physical organisation of data need not concern the user. If this is
altered, the corresponding software in the DBMS would change without 
the user programs having to be altered. 


With today’s networked computers, the availability of a central database 
facility makes it possible to access data files from widely different physical 
locations as a part of a single database. The next section in this chapter 
(client/server) addresses this in more detail. 


Database Management Systems 


A database is maintained and accessed with the help of a software 
system known as a Database Management System (DBMS). This allows 
the user to store large amounts of data which can easily be retrieved and 
manipulated with a high degree of flexibility, mainly to produce results of 
searches on screen displays or different types of reports. The DBMS 
provides the facilities to: 


• Create a database.

• Add, amend and delete data.

• Sort and search a database.

• Create and print reports.

• Perform relational, logical and string operations.

• Modify the database structures.


What is a Database Management System (DBMS)? 


A DBMS is a comprehensive software tool which allows users to create, 
maintain, and manipulate an integrated base of business data to produce 
relevant management information. ‘Integrated’ means the records are 
logically related to one another so that all data on a topic can be retrieved 
by simple requests. The DBMS software represents the interface between 
the user and the computer’s operating system and database. 


The term database describes a collection of related records forming an 
integral base of data, which can be accessed by a wide variety of
applications programs and user requests. 


In a DBMS, data needs to be entered only once. When the user instructs 
the program to sort data or compile a list, the program searches quickly 
through the data (in memory or in storage), and copies the required data 
into a new file for the purpose at hand. However, the user’s instructions 
do not change the original data in any way.


Functions of a Database Management System


• Data independence: for example, you have created a student
database with many student records. After some time, you decide to
change the structure of the student database to include telephone
numbers. The data does not need to be re-entered and the programs
do not need to be changed.

• Establish relationships among records in different tables (or files):
the user can obtain all data related to important data elements.

• Eliminate data redundancy: data is stored only once in a file which
can be accessed, for example, by both the student billing applications
program and the grade averaging program.

• Define the characteristics of the data: the user can create a
database to store data based on specific needs.

• Manage file access: the DBMS can examine user requests and clear
them for access to retrieve data (by use of passwords), thus making
data safe from unauthorised access.

• Maintain data integrity: data is not stored redundantly, therefore it
needs to be updated in only one place.
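
The data independence example in the list above (adding telephone numbers to a student database) can be sketched with Python's built-in `sqlite3` module. The table layout and the function `list_names` are assumptions made for illustration; `list_names` plays the part of an existing program that was written before the structure changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'A. Student')")


def list_names(connection):
    """An existing 'program': it names only the columns it needs."""
    return [row[0] for row in connection.execute("SELECT name FROM student ORDER BY student_id")]


before = list_names(conn)

# Change the structure: add a telephone column. Existing rows are kept.
conn.execute("ALTER TABLE student ADD COLUMN telephone TEXT")

after = list_names(conn)  # the program runs unchanged
telephone = conn.execute("SELECT telephone FROM student").fetchone()[0]
print(before, after, telephone)
```

The existing record does not have to be re-entered; its new telephone field is simply empty until a value is supplied.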


Using DBMS software, the personnel manager can now request a printout 
of all employees over 50 years of age showing employee number, name, 
post, grade, date of starting and salary. If he/she has access to the 
database from a computer, the up-to-date and accurate information 
required could be accessed and viewed on screen within a few minutes. 


A DBMS is an integrated set of software programs which provides all the
necessary capabilities for building and maintaining database files, 
extracting the information required for making decisions, and formatting 
the information into structured reports. The easiest way to view a DBMS is 
to think of it as a layer of software which surrounds the database files and 
acts as an interface between the database and the user. The software 
programs which are included in the DBMS are as follows: 


• Data Manipulation Language (DML) is used to access the data.

• Data Description Language (DDL) is used to define the data in the
database in terms of the fields in the files or tables.

  The DML and DDL can be combined in a Data Sublanguage (DSL).
  Perhaps the most common in use today is SQL (Structured Query
  Language), commonly referred to as a query language.

• Data Dictionary is a file containing the details of the data; it contains
the rules for the use of the database files and fields.

• A Transaction Log contains the record of activity which affects the
data in a database during a transaction. It is used to make sure that a
change made to the database is completed fully, as it may involve
updating more than one file. It is also used to back up databases and
rebuild files if they become damaged or destroyed.
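
The split between describing data and manipulating it can be seen in SQL itself. The sketch below uses Python's built-in `sqlite3` module; the `stock` table and its contents are invented for the example. The `CREATE TABLE` statement is data description, while `INSERT`, `UPDATE`, `SELECT` and `DELETE` are data manipulation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data description: define the fields and their types.
conn.execute(
    "CREATE TABLE stock (item_code TEXT PRIMARY KEY, description TEXT, quantity INTEGER NOT NULL)"
)

# Data manipulation: add, amend, read and delete the data itself.
conn.execute("INSERT INTO stock VALUES ('A100', 'Stapler', 12)")
conn.execute("UPDATE stock SET quantity = quantity - 2 WHERE item_code = 'A100'")
quantity = conn.execute("SELECT quantity FROM stock WHERE item_code = 'A100'").fetchone()[0]
conn.execute("DELETE FROM stock WHERE item_code = 'A100'")
remaining = conn.execute("SELECT COUNT(*) FROM stock").fetchone()[0]
print(quantity, remaining)
```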


Data Manipulation Language 


The Data Manipulation Language is made up of the technical instructions 
for the input and output routines in the DBMS. Each applications program 
which is written, needs certain data elements to process in order to 
produce particular types of information. A list of the required elements of 
data is contained within each application program. The DML uses these 
lists, identifies the elements of data required, and provides the necessary 
link to the database to supply the data to the program. 


When a DBMS has been implemented, the data dictionary and 
transaction log are constantly in use alongside the database files. 


Data Dictionary 


A Data Dictionary is a file that contains the details of the data; it contains 
the rules for the use of the database files. The information in a data 
dictionary differs in different DBMSs, but it generally contains the following 
type of information: 

• The data available.

• Where data is located.

• Descriptions (attributes) of the data.

• Ownership of the data (i.e. who is responsible for it).

• Access to data (i.e. who may retrieve it and who may change it).

• How the data is used.

• Relationships between data items.

• Limitations (security and privacy).


The data dictionary is in constant use as a reference tool. When data is 
requested, the DBMS refers to the data dictionary to find the details of 
where data is stored, whether the user has authority and so on. 
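
Many DBMSs let the user inspect part of this information directly. As a hedged illustration, SQLite (used here through Python's `sqlite3` module) keeps a catalogue table called `sqlite_master` and offers `PRAGMA table_info` to list the columns of a table. It records only structural details, not ownership or access rights, so it covers just part of what is listed above. The `personnel` table is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE personnel (employee_number INTEGER PRIMARY KEY, name TEXT NOT NULL, position TEXT)"
)

# Which tables exist? SQLite records them in its own catalogue table.
tables = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# What does each column look like? Each row is (cid, name, type, notnull, default, pk).
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(personnel)")
]
print(tables)
print(columns)
```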


Transaction Log 


This contains the record of activity which affects the data in a database 
during a transaction. It is used to back up databases and for rebuilding
files if they become damaged or destroyed. It is straightforward to use for 
backups, as all transactions are recorded and the previous day’s copy of 
the database is considered to be the current one, which is then updated 
using the transaction log. 
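
The recovery procedure described above (take the previous copy, then reapply the logged transactions) can be sketched in plain Python. The log format, the function names and the data are all assumptions made for the example; a real DBMS records far more detail in its log.

```python
import copy


def apply_transaction(database, transaction):
    """Apply one logged change: ('set', key, value) or ('delete', key)."""
    if transaction[0] == "set":
        _, key, value = transaction
        database[key] = value
    elif transaction[0] == "delete":
        database.pop(transaction[1], None)


def rebuild(backup_copy, transaction_log):
    """Recreate the current database from a backup plus the transaction log."""
    database = copy.deepcopy(backup_copy)  # leave the backup itself untouched
    for transaction in transaction_log:
        apply_transaction(database, transaction)
    return database


# Invented data: yesterday's backup and today's logged transactions.
backup = {201: "1 Old Road"}
log = [("set", 201, "3 Mill Lane"), ("set", 202, "8 Oak Road"), ("delete", 202)]

restored = rebuild(backup, log)
print(restored)
```

Because every change is replayed in the order it was logged, the rebuilt copy ends up in the same state as the database that was lost.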

Definition: database language


A generic term referring to a class of languages used for defining and 
accessing databases. 


A particular database language will be associated with a particular 
database management system. There are two distinct classes of 
database language: those that do not provide complete programming 
facilities and are designed to be used in association with a general- 
purpose programming language (the host language), and those that do 
provide computer programming facilities (database programming 
languages). 


Exercise 7.9 [20 minutes] 


A database management system consists of a number of parts, e.g. a
data dictionary. Identify three of the main constituents and briefly
describe their role in the DBMS.


Query Language 


Most users find a query language for data retrieval to be the most 
valuable aspect of DBMS software. Traditionally, managers rely on the 
information provided by periodic reports. However, this creates a problem 
when a decision must be made immediately and the information required 
to make it will not be available until the end of the week. 


Definition: Structured Query Language (SQL) 


A language designed for retrieving information from relational
databases.


Query languages allow managers to obtain information on demand using
requests that are close to everyday language. The information returned
is also in everyday language. To be effective, a query language must
allow the user to phrase requests for information in a very flexible
format. For example:


FIND Personnel WHERE birthyear > 1950 


where ‘Personnel’ is the name of a table and ‘birthyear’ is an attribute of 
the Personnel table. 


Notice that the programmer is supplying parameters of the table name 
and attribute name. When writing SQL code, the programmer is 
presented with a choice from a list of tables in the database and having 
specified the table, is presented with the list of attributes (fields) in the 
table. 
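
The FIND request above is written in a generic query notation. In SQL the same request would normally be phrased as a SELECT statement, as in this sketch using Python's built-in `sqlite3` module. The rows inserted into the Personnel table and the `name` column are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Personnel (name TEXT, birthyear INTEGER)")
conn.executemany(
    "INSERT INTO Personnel VALUES (?, ?)",
    [("A. Older", 1948), ("B. Younger", 1962)],  # invented sample rows
)

# SQL form of: FIND Personnel WHERE birthyear > 1950
rows = conn.execute(
    "SELECT name, birthyear FROM Personnel WHERE birthyear > 1950"
).fetchall()
print(rows)
```

The query states which rows are wanted, not how to search for them; the DBMS decides how the table is read.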




Exercise 7.10 [40 minutes] 


Give examples of questions a user may ask using a query language for 
the following: 


a) Interrogating an inventory database to determine whether any items 
of stock need to be re-ordered. 


b) Identifying the stock items for which the number stocked are of 
greater value than $10,000. 


c) Searching for those employees who have been with the company 
more than 25 years. 


d) Identifying those students who need to retake an exam (failed or
referred grade).


11.7 Database Design


Database design is a complex and specialised task which involves
matching a design to the overall information needs of an organisation. A 
modular approach, department by department, is usually the preferred 
method. 


Database design has two distinct phases: logical and physical design.


Logical database design is a representation of what the data actually is, 
rather than how it operates i.e. it is a description from a business 
perspective, rather than a technical one. 


Logical design involves defining user needs, analysing data elements and 
logical groups, and creating the data dictionary. Each element of data and 
the relationship between them must be identified. Two differing viewpoints 
must be included: 


• the schema (the overall database and the relationships within it);

• the sub-schema (the way in which particular records are linked to serve
specific purposes and/or users).


The first stage defines the users’ information needs and logically groups
them; this is called information requirements analysis. This is necessary
because different departments may require the same data items and this 
could affect the logical groupings. The personnel department would be 
interested in the following data (note this list is not exhaustive, but 
contains just enough data items to illustrate the point): 


• employee number, name, home address, telephone number, start
date, department, job title, salary grade, date of last review, date of
next review, performance rating, office/room number, works telephone
number.


As far as logical groupings are concerned, some data is needed for
contacting the employee, some is concerned with position, grade and 
salary and other data items are concerned with the review system (date of 
last review, date of next review and performance rating). 


However some data items would also be used by the salaries department, 
e.g. employee number, name, while others would only be used by the 
salaries department, e.g. annual salary, tax to date. 


Exercise 7.11 [20 minutes] 


a) Identify the duplicate data items in the personnel and salaries data. 


b) What attribute would be used to identify an employee? 


It was mentioned that the data needs to be ‘logically grouped’ during the 
requirements phase. In addition to the problem of identifying where data 
is stored twice or more, the analyst needs to identify why data is stored 
and who uses it. By separating the data logically, password protection 
can be placed on data items so that personnel are limited to the data to 
which they need access. For example, a junior clerk will be able to add 
new data concerning reviews in terms of dates etc, but would not be able 
to enter data concerning performance rating. 


The next stage in the design is to identify the reports and the related data 
that will be required from the database for each one. 


Exercise 7.12 [20 minutes] 


Name the data elements which are likely to be required to produce a 
list of reviews that will take place next week. 


The final stage in the logical design is to refine the logical subsets of data, 
i.e. the sub-schema, and to combine them into the overall schema. The 
schema contains a description of all the data elements to be stored, the 
logical records into which they will be grouped, and the number of 
individual database files or relations to be maintained within the 
framework of the DBMS. It also describes the relationships between the 
data elements and the structure (i.e. hierarchical, relational or network 
model). 
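
One common way of providing a sub-schema in a relational DBMS is a view: a named, restricted selection from the overall schema. The sketch below uses Python's built-in `sqlite3` module; the `employee` table, the `review_schedule` view and the data are invented for the example. SQLite has no user accounts, so the sketch shows only the restricted view itself, not the password protection mentioned earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Overall schema: every data item held about an employee.
    CREATE TABLE employee (
        employee_number INTEGER PRIMARY KEY,
        name TEXT,
        next_review TEXT,
        performance_rating TEXT,
        annual_salary INTEGER
    );
    INSERT INTO employee VALUES (201, 'A. Example', '2024-06-03', 'Good', 21000);

    -- Sub-schema: only the items a clerk scheduling reviews needs to see.
    CREATE VIEW review_schedule AS
        SELECT employee_number, name, next_review FROM employee;
""")

cursor = conn.execute("SELECT * FROM review_schedule")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```

The performance rating and salary remain in the database but are not visible through this view.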

11.8 Database Approach - Summary


The main advantages and disadvantages of a database approach are 
shown below. 


Advantages                  | Disadvantages
No data redundancy          | Complexity (need for technical expertise)
Easy file updating          | Higher costs
Data independence           |
Easy program maintenance    |
Increased user productivity |
Increased security          | Vulnerability
Standardisation             |


Figure 7.6 Advantages and disadvantages of a database approach


Study Note 


This section has provided only a very brief introduction to database 
design. 


Further information concerning databases can be found in: 


H D Clifton, D C Ince, A G Sutcliffe, Business Information Systems,
sixth edition, Pearson Education Limited 2000 


C S French, Computer Science, fifth edition, Letts Educational, 1996


12 Client/Server Computing


Storage capacity is crucial to the operation of a DBMS. The many 
gigabytes of data which move through large organisations cannot be 
handled by microcomputers. To manage these databases, large-scale 
DBMSs need to use very high capacity and high speed disks for storage. 
Such database files in a large organisation will use a number of disk 
storage devices, as well as additional ones for backup. 


Managers and other users in large organisations usually interact directly 
with the DBMS via a terminal connected to a large computer. The terminal 
allows users to communicate their requests for information to the system 
and view the results immediately. In the past, information was usually 
displayed in the form of text. However, many terminals now have colour 
graphics capability and users can view information graphically, which 
often makes it easier to understand. 


Database files are an important business resource and must be protected 
from damage, loss and unauthorised use. The most common way to 
protect them is to periodically make backup copies. The most popular 
form of backup for microcomputer hard disks is the tape streamer, or 
streaming tape unit. These devices are small, fast, and so easy to use 
that users can perform backup operations themselves. 


It has been stated that a database is held centrally and used in all areas 
of an organisation. For users to access a database from different 
workstations in different offices, it is necessary to use a network.
Client/server computing is one example of this type of processing.


Client/server computing is a concept in which an application is divided into 
multiple tasks that are executed on different hardware platforms, of which 
one is an intelligent workstation or PC (client), to achieve a net 
advantage. 


12.1 Client/Server Features 


• A single application can be divided into self-contained tasks.
• Tasks can be performed by different machines.
• One of the machines will be a PC or intelligent workstation.


The client/server model assumes that computing is a dialogue (or many 
dialogues) between different hardware elements which together make up 
the whole task. Historically, the elements have been general purpose 
desktop computers (the clients) and special purpose processors (the 
servers) dedicated to performing only one type of function. 


In understanding the client/server model, it must be realised that 
processing can occur in different physical or logical areas in the system. 
Once this is accepted, it becomes easy to conceive of a separation
between client and server at almost any level. 


The main advantage of client/server systems is that each component, with 
its own set of tasks, can be optimised for a different set of operations, 
thus taking the best advantage of the computer on which it is installed. 


• The server component responsible for data storage and management
is optimised for data integrity, security, and transaction performance.

• The client component assumes the responsibility for presenting the
information to the user and is therefore optimised for presentation
and ease of use.


This division of responsibilities makes a lot of sense in today’s computing 
environment where many, if not all of the employees in an organisation 
are equipped with computers for word-processing, spreadsheet analysis, 
electronic communications, and so on. These PCs are often under-utilised 
by such tasks, and can be better used by diverting some of their 
processing potential towards a corporate client/server approach. 


Study Note


This is a difficult concept to grasp. It will help to read articles in
computing magazines in addition to reading text books on computer
science.


Clients and servers can be separate machines with servers providing 
particular functions such as printing or database management, or they 
can be separate processors running on either multi-computer systems or 
even on a single machine. The distinction between client and server is 
that the client initiates a request and the server fulfils it. Clients and 
servers may be parts of processes, whole processes, entire programs, 
small computers, complete networks, or even large mainframes. 
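
The request/fulfil distinction can be shown with a minimal sketch in which both roles run on a single machine, as the paragraph above allows. It assumes nothing beyond Python's standard library; the function names `serve` and `ask`, and the trivial "service" of upper-casing text, are hypothetical choices for illustration.

```python
import socket
import threading


def serve(conn):
    """Server side: wait for one request and fulfil it."""
    with conn:
        request = conn.recv(1024).decode()
        # The "service" here is deliberately trivial: upper-case the text.
        conn.sendall(request.upper().encode())


def ask(conn, text):
    """Client side: initiate a request and return the server's reply."""
    with conn:
        conn.sendall(text.encode())
        return conn.recv(1024).decode()


# A connected pair of sockets stands in for the network link.
client_end, server_end = socket.socketpair()
worker = threading.Thread(target=serve, args=(server_end,))
worker.start()
reply = ask(client_end, "list reviews")
worker.join()
print(reply)  # LIST REVIEWS
```

The server never acts on its own initiative; it only responds once the client has sent a request.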


Presentation logic, which includes graphical user interfaces (GUIs), 
resides on client platforms, and applications and database management 
reside on servers. 


The client/server approach allows considerably more productive and 
versatile use to be made of raw data: ‘what-if’ spreadsheet models, real- 
time simulators, GUIs, graphics, hypermedia, presentations and 
visualisation techniques all help to make data more comprehensible and 
more meaningful. They enable computer systems to provide management 
information and decision support functions, not merely statistical 
summaries, tables of figures or form views. They do this by allowing end- 
users to call on sophisticated processing abilities at the desktop. They still 
need raw data however, and they need to be able to extract that data from 
the increasingly large amount that organisations add day-by-day. The 
problem, then, has been to combine mainframe levels of data handling 
experience with PC-type information management. The logical answer, to 
introduce mainframes into the PC environment using client/server
architecture, may well be the best. 


The benefits would be undeniable: user-friendly information, virtually 
unlimited storage capacity, high levels of security, reliability and 
performance, without the need to completely rethink IT strategy or 
development plans. Of course, the opinion will persist that one day the 
mainframe will be redundant, and it is almost certainly correct. But this 
has little to do with the value of the client/server paradigm. The mainframe 
can survive within that paradigm for many years to come, as a machine 
optimised for throughput, availability and raw data handling. 


Importantly, a client/server system offers a great deal of flexibility while 
allowing full access to important data. As the major functional components 
of the system are separated from one another, it becomes a simple task 
to add new front end pieces to accomplish different tasks. For example, a 
spreadsheet program can be used as a SQL server front-end to analyse 
important corporate data. At the same time, an order entry system, 
perhaps created with a high-productivity development tool such as the 
Microsoft Visual Basic programming system, can be used to enter new 
data or to query and update existing data. Graphical report-writing tools 
can be used concurrently with these other applications to provide detailed 
insight into corporate activities. Since server interface specifications are 
published and well known, any number of tools can be used safely. Users 
can mix the best front-end components to build their desired systems. 
Regardless of which tool or application is chosen, security and integrity of 
data is consistently and safely applied using the same rules for access. 
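
The point that the same rules apply whichever front end is used can be illustrated with a small sketch. Here an in-memory SQLite database plays the part of the server, and two hypothetical functions play the parts of an order entry front end and an analysis front end; the table, the rule and the function names are all assumptions for illustration.

```python
import sqlite3

# The "server": it owns the data and the integrity rule.
server = sqlite3.connect(":memory:")
server.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity > 0))"
)


def order_entry_front_end(quantity):
    """One client: enters new data."""
    server.execute("INSERT INTO orders (quantity) VALUES (?)", (quantity,))


def analysis_front_end():
    """Another client: summarises the same data."""
    return server.execute("SELECT SUM(quantity) FROM orders").fetchone()[0]


order_entry_front_end(5)
order_entry_front_end(7)
try:
    order_entry_front_end(-3)  # rejected by the server's rule, not by the client
except sqlite3.IntegrityError as err:
    print("rejected:", err)
print(analysis_front_end())  # 12
```

Because the rule lives with the data, a new front end added later would be subject to it without any extra work.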


Advantages:

• Reliability.
• Cost.


Disadvantages:

• Security of data.
• Uneconomical use of storage.


The main disadvantages of the client/server approach are related to 
storage and data integrity. Mainframe systems automatically back-up data 
and provide recovery; client/server storage management systems are 
currently still in their infancy and relatively unsophisticated. Users often 
have wasted space on their local storage and more importantly, it may be 
out of date. Physical security can also be a problem, when small units are 
stored in a number of different places. 
13 Summary


In this chapter we have covered: 


• Emerging ways of thinking about software development such as
patterns, the Unified Software Development Process, re-factoring and
Design by Contract.

• Emerging development aids, including Java beans, visual
programming and class libraries.

• More established, yet still evolving tools and techniques for
programmers, including application program generators, database
management systems, the database query language and client/server
computing.


The theme of this chapter has been the changing nature of programming, 
as those involved in programming seek greater efficiency and simpler 
ways of devising and constructing software. It is important to be aware 
that programming is an evolving discipline with new tools and techniques 
emerging on a regular basis. 


14 Self Study 


These exercises and self study recommendations are designed to help 
you learn about the alternative methods. The exercises consist of: 
• Recommended reading.
• Internet research on key topics.
• Activities.
• Review questions. Use them as follows:
  - Work through the questions and jot down your initial answers.
  - All the answers are contained in the chapter text. Go back and
    review the text to check the accuracy of your answers. Where
    your answer is incomplete or not correct, enter the correct answer
    against the question and use this for revision or for retesting at a
    later date.
What is the general trend in programming and programming
languages?


Why is it important to know about advances in these areas? 
What is meant by software patterns? 

What do pattern books contain? 

Why are patterns useful? 

What is commonly documented in patterns? 

What is a Unified Process? 

Name its characteristics and compare it to other lifecycle models. 


How does the UP break down activities? 


What approach is then adopted? 


What are the four main phases of the UP? 

What are the five workflow stages? 

Name the problems re-factoring tries to address. 

What approach is taken with re-factoring? 

How are you recommended to proceed with re-factoring? 
What is an assertion? 

What is a pre-condition? 

What is a post-condition? 


What is an invariant? 
Self Study 2


What are the benefits to programmers of class libraries? 
Are there any disadvantages? 


State what you may need to manage your selection and use of class 
libraries. 


How might visual programming languages be classified? Name three 
possible categories. 


Is Visual Basic a visual programming language? If not, why not?
What is a Java bean? 
What differentiates beans from Java classes? 


What is the difference between design time and run time? 


What type of programmer will benefit most from beans type assembly? 


What are the common defining features of Java beans? 


Self Study 3 


Why are application program generators regarded as 4GLs? 
In what way are they user-friendly? 

What are the features of a 4GL? 

What are the advantages of a report generator? 


What is the difference between application generators, report 
generators and query languages? 


Give examples of application program generators. 
In which domains would you encounter application generators? 


14.1 Further Reading 
Further information concerning relational databases can be found in:


H D Clifton, D C Ince, A G Sutcliffe, Business Information Systems, sixth
edition, Pearson Education Limited, 2000, ISBN 0130829609.


C S French, Computer Science, fifth edition, Letts Educational, 1996, 
ISBN 1858051649. 


Any computing text book published after 1996 and having DBMS in the 
index. 
Additional Notes


The following additional notes are provided: 


The duties of a database administrator. 
The hierarchical database model. 

The network database model. 

SQL. 


The Duties of a Database Administrator 


The database administrator co-ordinates the use of the database. This 
person has six main responsibilities, as outlined below: 


Database design — plays a key role in both logical and physical design 
phases, guides the definition of the database content and data 
dictionary, and sets coding, backup/restart procedures. 


Database implementation and operation — guides the use on a day-to- 
day basis, i.e. adding, deleting, controlling access, detecting and 
repairing losses, instituting recovery, and restart & backup 
procedures. 


User co-ordination — receives and reviews user requests for support, 
establishes feasibility, resolves redundant or conflicting requests, and 
establishes priorities. Enforces standards for data access, storage 
formats, data element names, etc. 


Backup and recovery — prepares the plan for the regular backing up of 
the database and establishes the procedures for recovering from 
failures, due to either hardware or software. 


Performance monitoring — responsible for making sure that the DBMS 
satisfies requirements. Regular monitoring ensures that if problems 
occur, they can be readily identified and steps taken to remedy them. 


System security — this usually involves the issuing of passwords and 
other security measures to control access to the database. 
Hierarchical Database


[Diagram not reproduced: a tree with "Company Database" at the root and branches including "Expenses, Tax Returns, etc".]


Figure 7.7 Example of a Hierarchical Database Model


In this model, data is organised into related groups similar to a family tree. 
There are parent records and child records. Parent records are higher up 
the tree than child records; each child can have only one parent, i.e. any 
one record can have only one record above it, but may have many below. 
The record at the top or highest level is known as the root record and this 
is the key to the model and connects the different branches. 


To store or retrieve records, the DBMS begins at the root and moves 
downward until the required record is located. Note that there is no 
connection between separate branches. 
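
The root-downward search can be sketched with a small tree structure. This is an illustration rather than how any particular hierarchical DBMS is implemented; the `Record` class, the `find` function and the record names "Payroll", "Accounts" and "Employee 17" are hypothetical.

```python
class Record:
    """A record in a hierarchical model: one parent, many children."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self  # each child has exactly one parent
        self.children.append(child)
        return child


def find(record, name, path=()):
    """Search downward from `record`; return the path of names to `name`."""
    path = path + (record.name,)
    if record.name == name:
        return path
    for child in record.children:
        found = find(child, name, path)
        if found:
            return found
    return None


root = Record("Company Database")
payroll = root.add_child(Record("Payroll"))
accounts = root.add_child(Record("Accounts"))
payroll.add_child(Record("Employee 17"))
accounts.add_child(Record("Expenses"))

print(find(root, "Expenses"))
# ('Company Database', 'Accounts', 'Expenses')
```

Note that the only route from "Employee 17" to "Expenses" is back through the root, which is why questions spanning two branches are awkward in this model.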


Main advantage: 


• Data is easily stored and retrieved.


Main disadvantages:


• Records which are in separate groups (as in figure 7.7) cannot easily
be linked together, so it is difficult to answer questions such as “How
much were the expenses for a particular employee in a given month?”

• If a parent is deleted, all the children are automatically deleted.

• Updating is complex and requires the programmer to know all the
links.

• There is often data redundancy, as some data must necessarily be
stored in more than one tree.
Network Databases


These are similar to the hierarchical model, but each record can have 
more than one parent, thus overcoming the main limitation of hierarchical 
databases because they allow relationships between records in different 
groups. 


Main advantage: 


• Can provide sophisticated logical links between data.


Main disadvantage:


• User is limited to retrieving only data that can be accessed using the
established links (as in a hierarchical database).
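
A minimal sketch of the idea that a record may have more than one parent is given below. The dictionaries and the `link` helper are hypothetical, as are the record names; real network DBMSs manage such links internally.

```python
from collections import defaultdict

# owner -> members, and the reverse index member -> owners.
members = defaultdict(set)
owners = defaultdict(set)


def link(owner, member):
    """Record an explicit link; a member may be linked to several owners."""
    members[owner].add(member)
    owners[member].add(owner)


link("Employee 17", "Expense claim 203")
link("March", "Expense claim 203")
link("Employee 17", "Expense claim 250")
link("April", "Expense claim 250")

# Expenses for a particular employee in a given month: follow both links.
print(sorted(members["Employee 17"] & members["March"]))
# ['Expense claim 203']
```

The query succeeds only because links from both the employee and the month were set up in advance, which reflects the disadvantage noted above.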


Query Languages 


Query languages are designed to allow users to retrieve information from 
databases and to ask questions about data stored in them. Such requests 
are very similar to spoken language but they do have a specific grammar, 
syntax and vocabulary which must be used (in the same way that other 
computer languages do). This language needs to be learned by both 
programmers and users, but is not difficult. 


For example, a manager needs to know how many items in an inventory 
need to be re-ordered. 


The query language will do the following to retrieve the information: 


• Copy the data for items with quantity-on-hand less than the re-order
point into a temporary location in the main memory.

• Sort the data into order by inventory number.

• Present the information on the computer screen (or printer).


The manager now has the information necessary to proceed with re- 
ordering low stock items. It is important to note that the manager did not 
have to specify how to get the job done, only what needed to be done. In 
other words, in our example, the user needed only to specify the question, 
and the system automatically performed each of the three steps listed 
above. 
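
The inventory request might look as follows in SQL, run here through Python's built-in `sqlite3` module. The table name, column names and sample rows are invented for illustration; the query states only what is wanted (items below their re-order point, in inventory number order) and leaves the selecting, sorting and presenting to the system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " inventory_no INTEGER PRIMARY KEY,"
    " description TEXT,"
    " quantity_on_hand INTEGER,"
    " reorder_point INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        (300, "Stapler", 2, 10),
        (100, "Paper", 50, 20),
        (200, "Toner", 1, 5),
    ],
)

# The whole request: what is wanted, not how to obtain it.
to_reorder = conn.execute(
    "SELECT inventory_no, description, quantity_on_hand"
    " FROM inventory"
    " WHERE quantity_on_hand < reorder_point"
    " ORDER BY inventory_no"
).fetchall()

for row in to_reorder:
    print(row)
# (200, 'Toner', 1)
# (300, 'Stapler', 2)
```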


Some query languages also allow the user to modify databases and add 
or delete entries in the same way that database management software 
does. 


The standard relational database language is SQL. It is used within such
popular database packages as Oracle and Ingres, and Access can
generate SQL.
1 Learning Outcomes


After completing this chapter you will be able to: 

• Explain why agreed standards in development methodology and
documentation lead to maintainable systems.

• Describe the systems cycle.

• Describe the software development lifecycle.

• Understand requirements analysis.

• Understand system analysis.

• Understand the need for documentation and coding standards.

• State the attributes of good documentation.

• Understand and be able to use the various tools and techniques used
to document a program.

• Achieve an awareness of the programmers’ role in software
development.


2 Introduction 


So far you have been introduced to a variety of approaches to software 
problem solving and coding. In this chapter, the focus is on how you can 
combine all the aspects you have encountered into a process for software 
development. 


You will be revisiting many of the tools and techniques you have already 
encountered, but this time in the context of an overall programme of 
development. 


This chapter will focus on implementation, how all the parts of the jigsaw 
come together to provide a guiding framework for systems development 
and within that, the development of software. The concepts you have 
encountered as individual components of the programming process will 
be set in the context of full scale development in action, with particular 
emphasis on the importance of documentation. 


The objective is to produce good software. To achieve this requires: 


• positive effort to record clearly, i.e. document, all key aspects of
construction (especially programming);

• an organised and methodical approach to the development process.


It is important to document why the system was developed, what was 
developed and how it was developed. This is important for present and 
future guidance e.g. for system sponsors, users, project managers,
development team members and maintenance staff. If the personnel 
concerned with the development of the system fail to have a basic 
knowledge of its structure or rationale, it becomes extremely difficult and 
costly to correct technical problems when they arise. 


It is generally accepted that software is likely to be of a higher quality if 
developers are methodical in their approach. When considering the 
processes of software development, it has become good practice to think 
in terms of lifecycle models (LCMs). LCMs are abstract, conceptual 
models which help you think about development in an organised way. 


There are a variety of models, some of which will be described below, and 
although they differ in certain aspects, they share a common approach in 
breaking down the development process into manageable chunks. The 
key characteristic of lifecycle models is that development passes through 
a number of stages, from initial idea to installation to maintenance, and 
there are checks and balances at each significant point to ensure that the 
software is correct. 


The lifecycle begins when an application is first conceived and ends when 
it is no longer in use. At system level, it includes aspects such as initial 
concept, requirements analysis, functional design, internal design, 
documentation planning, test planning, coding, document preparation, 
integration, testing, maintenance, updates, retesting and phase-out. 


Note that lifecycle models can be applied to both the process of system 
development e.g. the construction of a large scale information system, 
and to the process of software development, e.g. the building of 
applications which may form part of a larger system. It is therefore 
possible to have a particular lifecycle model governing the overall 
development of a system and other lifecycles to govern particular aspects 
of software construction within the overall programme. In practice, the 
terms ‘system development lifecycle’ and ‘software development lifecycle’ 
appear to be used interchangeably by IT professionals and the distinction 
can become blurred. 


Whether it is a system or software lifecycle, a set of documentation ought 
to be prepared for each phase, ideally to set standards of output e.g. with 
agreed styles of layout and prescribed nature of content. 


Standardised documentation: 


• provides managers with documents to review at significant
developmental milestones to ensure that requirements have been
satisfied and that resources should continue to be expended;

• records technical information to allow co-ordination of later
development, use, and modification;

• provides authors of documents and managers of the project
development with a guide to follow in preparing and checking
documentation;

• provides uniformity of style, format and content of documentation
throughout the project;

• promotes consistency.


3 The Traditional Systems Lifecycle


As noted, LCMs do vary, but for explanation purposes, we will use the 
following generic model which contains all the traditional phases of a 
systems lifecycle (this model was featured in Chapter 4) and is 
recognised by all professional developers. 


The stages of development are: 


• Initial study.
• Requirements analysis.
• Systems analysis.
• Design.
• Coding.
• Testing.
• Implementation and support.


In this model, at the end of each phase, there is a control point where the 
achievement of the goals of the phase can be evaluated. Subsequently, 
there are three possibilities for the progress of the project. 


• The results of the evaluation are satisfactory (i.e. the goals of the
specific phase have been achieved) and the development can go
onto the next phase.

• The results of the evaluation are not completely satisfactory (i.e. some
parts of the program need to be clarified, or improved). Therefore
more work needs to be done on this phase before going onto the next
one.

• The results are very poor and the entire project may be discontinued,
or restarted.
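
The control point at the end of each phase can be pictured as a simple decision function. The sketch below is illustrative only; the `Outcome` names and the `next_phase` helper are hypothetical, and in practice the evaluation is a management judgement rather than a computed value.

```python
from enum import Enum

PHASES = [
    "Initial study", "Requirements analysis", "Systems analysis",
    "Design", "Coding", "Testing", "Implementation and support",
]


class Outcome(Enum):
    SATISFACTORY = "proceed to the next phase"
    NEEDS_WORK = "repeat the current phase"
    VERY_POOR = "discontinue or restart the project"


def next_phase(current, outcome):
    """Return the phase to work on next.

    None means there is no next phase: either the last phase has been
    passed or the project has been stopped at this control point.
    """
    index = PHASES.index(current)
    if outcome is Outcome.SATISFACTORY:
        return PHASES[index + 1] if index + 1 < len(PHASES) else None
    if outcome is Outcome.NEEDS_WORK:
        return current
    return None  # VERY_POOR: management decides whether to restart


print(next_phase("Design", Outcome.SATISFACTORY))  # Coding
print(next_phase("Design", Outcome.NEEDS_WORK))    # Design
```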


Once a phase is completed, it is better not to return to a previous phase.
However, this may be necessary in some specific cases, e.g. to correct
terms.
The Initial Study


The main purpose of this stage is to determine the feasibility of the 
project. It is very important to define the problem as precisely as possible 
in user terms before any development begins. A solution cannot be 
devised if the problem is not clearly stated and understood. At the end of 
this stage, the minimum outputs are: 


• A brief description of the proposed system (which could include the
hardware and software specifications).

• An estimate of the project cost.

• A possible completion date for the work.


Most methodologies start with a product definition of one specific new 
product, but do not attempt to address how that product fits in with the 
company’s strategy or existing systems. If this issue is not addressed at 
the outset, problematic system integration issues can occur in later 
product development stages. 


Requirements Analysis 


At this stage, an accurate and complete set of user requirements is 
produced to define the characteristics required for an acceptable solution. 
This information is obtained mainly by direct interviews with current and 
future users of the system. 


A Requirements Analysis document contains the following information: 


• Evidence of a clear understanding of the proposed system or solution
between the user and the developer.

• A list of the existing tools and those to be acquired, available facilities
and personnel for developing the solution.

• A schedule for the stages of the project with the deliverables for each
stage.


Poorly documented requirements specifications will more than likely result 
in problems or even complete failure of a complex software project. 
Requirements are the details describing an application’s externally 
perceived functionality and properties. 


Requirements should be clear, complete, appropriately detailed, cohesive, 
attainable, and testable. A non-testable requirement would be, for 
example, ‘user-friendly’ (because it is too subjective). A testable 
requirement would be something like ‘users must enter their previously- 
assigned password to access the application’. 
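
To show why the password requirement counts as testable, the sketch below turns it into pass/fail checks. The `may_access` function, the user name and the password are hypothetical; a production system would also use a salted, deliberately slow password hash rather than plain SHA-256, which is used here only to keep the example short.

```python
import hashlib
import hmac


def _digest(password):
    """Hash a password so that plain text is never stored (simplified)."""
    return hashlib.sha256(password.encode()).hexdigest()


# Hypothetical store of previously assigned passwords.
ASSIGNED = {"jsmith": _digest("s3cret-assigned")}


def may_access(user, password):
    """Requirement: users must enter their previously assigned password."""
    expected = ASSIGNED.get(user)
    return expected is not None and hmac.compare_digest(expected, _digest(password))


# The requirement translates directly into checks that pass or fail.
assert may_access("jsmith", "s3cret-assigned")
assert not may_access("jsmith", "wrong")
assert not may_access("nobody", "s3cret-assigned")
```

No comparable set of checks could be written for 'user-friendly', which is what makes that requirement non-testable as stated.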
Determining and organising requirements details in a useful and efficient
way requires much effort; different methods are available depending on 
the particular project. There is much literature describing the different 
approaches to this task. 


Care should be taken to involve all of a project’s significant customers in
the requirements process. Customers could be in-house personnel or 
outside personnel, and could include end-users, customer acceptance 
testers, customer contract officers, customer management, future 
software maintenance engineers, salespeople, etc. Those who could later 
‘derail’ the project if their expectations are not met should be included if 
possible. 


Organisations vary considerably in their handling of requirements 
specifications. Ideally, the requirements are specified in a document using 
statements such as ‘The product shall’. Design specifications should not 
be confused with requirements; design specifications are traceable to the 
requirements. 


Requirements may appear in high-level project plans, functional 
specification documents, design documents, or in other documents in 
various levels of detail. No matter what they are called, some type of 
documentation with detailed requirements will be needed by testers in 
order to properly plan and execute tests. Without such documentation, 
there will be no clear-cut way to determine if a software application is 
performing correctly. 


Systems Analysis 


The object of the systems analysis stage is to describe, in detail, a 
solution that will fully satisfy user requirements. That is, the user 
requirements from the previous stage are translated into terminology 
which can be understood by the system designers, programmers and 
testers. This includes a description of: 


• the inputs to the process;
• the operations the system performs for each input;
• the output obtained for the corresponding input.


The Inputs to the Process 


The developer begins by defining the type of input to be processed by the 
system. Data may be from either external or internal sources, or a 
combination of both. 


An example of external data input is information from a manufacturing 
plant or simply a motor which is being monitored. In such a case, it is 
necessary to use special equipment to convert the information from 
analogue into digital form, in order to be processed by a computer. If this
information is already in digital form, i.e. from a keyboard, or another 
digital system, it can be imported directly into the system. 


If the input is already stored in a file (internal data input), the storage 
format must be specified, so that it can be read. 


Numerical input can have maximum and/or minimum values and data not 
within that range could be ignored or considered as errors. However, it is 
a common mistake to set boundaries without consulting the user. It may 
be more convenient for the implementation, but the result of this is that 
some correct data is ignored. The input can be a command, in which case 
there must be a protocol that specifies the number of parameters and 
their formats. The first parameters of the input command may specify the 
action to be carried out by the process, whilst the remaining parameters 
can be numerical values related to the action. For example, START,1,0.1 
could be an input command controlling a motor which will start motor 
number 1 at the speed of 0.1 rad./sec. 
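
A command protocol of this kind could be checked by a small parser such as the one below. It is a sketch under assumptions: the `STOP` action, the `MAX_SPEED` limit and the `parse_command` name do not come from the text, and, as noted above, any real limit should be agreed with the user rather than chosen for the programmer's convenience.

```python
MAX_SPEED = 10.0  # rad/s; an assumed limit that would be agreed with the user


def parse_command(line):
    """Parse 'ACTION,motor,speed' into (action, motor_number, speed)."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 parameters, got {len(parts)}")
    action, motor_text, speed_text = parts
    if action not in {"START", "STOP"}:  # STOP is a hypothetical second action
        raise ValueError(f"unknown action {action!r}")
    motor = int(motor_text)      # raises ValueError if not a whole number
    speed = float(speed_text)    # raises ValueError if not numeric
    if not 0.0 <= speed <= MAX_SPEED:
        raise ValueError(f"speed {speed} outside 0..{MAX_SPEED} rad/s")
    return action, motor, speed


print(parse_command("START,1,0.1"))  # ('START', 1, 0.1)
```

Out-of-range or malformed input is reported as an error rather than silently ignored, so that no valid data is lost without the user knowing.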


The Process and the Output 


Output can be created in various formats, depending upon the processing 
action. At this stage, it should be kept in mind that the systems analysis 
document should not define how the system will work, but it must define 
what the system will do to satisfy user requirements. Therefore, for any 
given input, the process and the associated output should be clearly 
described. This includes prompts and terminal messages, error messages 
and warning reports, graphs, computed results, etc. 


Design 


The design stage describes how the solution will be built to meet the user 
requirements specified at the previous stage. The final set of programs is 
produced directly from this description, so it has to be a detailed, technical 
and logical definition of the final system. 


Complex problems cannot be solved in one step, but they can be divided 
into a set of sub-problems which can be more easily solved. This 
decomposition process results in a set of programs and modules that will 
interact with each other. A system test plan needs to be developed for 
each program or module that will be used in the next stage, to ensure that 
they adhere to their individual specifications. 


All programs and modules in the system are defined in terms of their 
inputs, outputs and required functions and processes. The interaction 
parameters (timing, performance requirements) between each of the 
system’s programs and modules are explicitly defined. 
At this stage, the use of formal program design techniques and
programming standards is recommended. The designer selects proper 
data structures and algorithms for the implementation of the input and 
output data and the system functions. The final details may be left until 
the coding stage. 


As this is the last stage before the coding of the new system, a decision 
should be taken about whether the programs are going to be developed 
internally, externally or both. The system can either be developed from 
scratch, from parts purchased as complete packages, or contracted to an 
independent programmer who will produce the required code from the 
program specifications. 


Coding


The object of this stage is to produce the computer programs that will
make up the system. Ideally, the coding should start when the previous
phase (the system design) is completed. However, some re-design of the
programs and modules is always necessary even if the system design is
good.


The programs and modules need to be tested according to the system 
test plan, developed in the design phase. This task is carried out whilst 
the program is being created. 


This phase is complete when all the code has been written and 
documented, and compiled without errors. The system is then ready to be 
tested, which is the next phase of the lifecycle. 


What is good code? Good code is code which: 


• works;
• is bug free;
• is readable and maintainable.


Some organisations have coding standards which all developers are 
supposed to adhere to, but everyone has different ideas about what is 
best, or how many rules there should be. 


Software metrics, such as McCabe Complexity metrics, used to measure 
the complexity of a program, are beyond the scope of this chapter. 
Sometimes excessive use of standards and rules can stifle productivity 
and creativity. ‘Peer reviews’, ‘buddy checks’, code analysis tools, etc. can
be used to check for problems and enforce standards.

Testing


In the previous stage, modules were tested in isolation; therefore the next 
step is to test them as a group to see how they interact with each other. 
Afterwards, the system must be tested in each environment in which it is 
likely to be used. For example, the programs may have been developed 
on machines using the latest technology, but the user may be working 
with older machines which were specified in the requirements analysis 
phase. 
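
The difference between testing modules in isolation and testing them as a group can be sketched in a few lines of Python. The functions `parse_amount`, `add_vat` and `price_from_input`, and the 20% rate, are hypothetical and are used only for illustration.

```python
# Hypothetical modules, invented for illustration only.
def parse_amount(text):
    """Convert user input such as ' 12.50 ' to a number."""
    return float(text.strip())


def add_vat(amount, rate=0.20):
    """Return the amount with tax added, rounded to two decimal places."""
    return round(amount * (1 + rate), 2)


def price_from_input(text):
    """Combine both modules; this is what an integration test exercises."""
    return add_vat(parse_amount(text))


# Unit tests: each module is checked on its own.
assert parse_amount(" 12.50 ") == 12.5
assert add_vat(100.0) == 120.0

# Integration test: the modules are checked working together.
assert price_from_input(" 100 ") == 120.0
```

A fault that only appears when the modules are combined, such as one module passing text where the other expects a number, would be caught by the last assertion but not by the first two.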


Finally, the software is tested by users. Initially, it will be tested in a 
controlled environment to ensure that it satisfies users’ requirements and 
then it is tested in a live environment to uncover any hidden problems. 


This is not always possible, as some applications cannot tolerate errors. A 
nuclear reactor control program, a flight control program in an aircraft or a 
patient monitoring system for intensive care units in hospitals must be 
100% error-free before they are tested in a live environment. The only 
way to test this software is by simulating the live environment, e.g. a 
simulation of a nuclear reactor will be developed to test the system which 
will control it. 


Implementation and Support 


When all the previous stages have been completed to the satisfaction of 
everyone involved, the system is ready for implementation. This is a 
co-operative venture involving developers and client staff including 
management, end users and technical support personnel. 


After installation of the system, it must be kept operational, and updated 
according to the needs of the users. If a bug is found, it needs to be 
communicated and assigned to developers who can fix it. After the 
problem is resolved, fixes should be retested, and a resolution made 
regarding requirements for regression testing, to check that fixes did not 
create problems elsewhere. 


A variety of commercial problem-tracking/management software tools are
available to assist in this process.

Factors to consider in tracking the problem include: 

• The completeness of information so that developers can understand
the bug, obtain an idea of its severity, and reproduce it if necessary.

• Bug identifier (number, ID, etc.), status (e.g. Released for Retest,
New, etc.), description.

• The application name or identifier and version.

• The function, module, feature, object, screen, etc. where the bug
occurred.

• Environment specifics, system, platform, relevant hardware specifics.

• Test case name/number/identifier and tester name.

• Description of steps needed to reproduce the bug if not covered by a
test case or if the developer does not have easy access to the test
case/test script/test tool.

• Names and/or descriptions of files/data/messages/etc. used in the test.

• File excerpts/error messages/log file excerpts/screen shots/test tool
logs that would be helpful in finding the cause of the problem.

• Severity estimate (a 5-level range such as 1-5 or ‘critical-to-low’ is
common).
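
As a rough illustration, the factors listed above could be held in a single record. The sketch below is one possible layout in Python, not the format of any particular tracking tool; the class names, field names, the intermediate severity level names and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    """Five-level scale from critical to low; the middle names are assumed."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    MINOR = 4
    LOW = 5


@dataclass
class BugReport:
    bug_id: str
    status: str                  # e.g. "New" or "Released for Retest"
    description: str
    application: str
    version: str
    location: str                # function, module, screen, etc.
    environment: str             # system, platform, relevant hardware
    test_case: str               # empty if no test case covers the bug
    tester: str
    severity: Severity
    steps_to_reproduce: list = field(default_factory=list)
    attachments: list = field(default_factory=list)   # logs, screen shots

    def is_reproducible(self):
        """True if a test case or written steps describe how to repeat it."""
        return bool(self.test_case or self.steps_to_reproduce)


# Hypothetical report used only to show the record in use.
report = BugReport(
    bug_id="BUG-0001", status="New",
    description="Invoice total ignores discount",
    application="Invoicing", version="1.0",
    location="CalculateInvoiceTotal", environment="Test server",
    test_case="", tester="A. Tester", severity=Severity.HIGH,
    steps_to_reproduce=["Create invoice", "Apply discount", "View total"],
)
```

Making most fields mandatory means an incomplete report cannot be created, which supports the first factor in the list: completeness of information.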


If there is a need to update or change the system rather than replace it, 
then such work is governed by software development lifecycle models. 


Exercise 8.1 [60 minutes] 


This can be either an individual or class based discussion exercise. 
Provide answers to each of the following questions on requirements 
and systems analysis. 


a) At which point in the development lifecycle does the analysis of 
requirements appear? 


b) Whose requirements are analysed? 


c) State the information which requirements analysis may contain. 


Why is it important to document requirements carefully? 
Who needs this documentation? 
How should the document be written? 


How might the information concerning user requirements be 
gathered? 


What is the purpose of a systems analysis? 


How is it linked to requirements analysis? 


The development of software involves stages which are very similar to the 
systems development stages. The initial input to the coding process 
contains the functional specifications that the programs must fulfil. The 
programming team then begin the design of the programs, followed by the 
production of the working code. This is then tested and amended as
necessary until the working programs can be integrated into the proposed 
new or updated system. Feedback from the testers as a result of their 
review of the programs then creates a new cycle of software 
development. In summary, the software development lifecycle consists of 
the following stages: 


• Specification.
• Design.
• Implementation.
• Testing.
• Review and Maintenance.


Software lifecycle models include the following: 


• Waterfall.
• V-Shaped.
• Prototyping.
• Incremental.
• Spiral.


Waterfall Model 


This consists of a series of phases through which a project progresses in 
a sequential order, and is the nearest in type and intention to our generic 
model. Note that each phase must be completed before the project can 
progress to the next phase. At the end of each phase is some form of 
gateway, usually a formal review, where the relevant decision is made. 
The process will remain under control if all corrections only need to go 
back one stage to recover the situation. The model is workable for an 
individual or cohesive team with considerable experience, who are not 
prone to errors. 


There is no overlap between phases. Deliverables are frozen at the end of 
each phase and serve as the baseline for the subsequent phases. You do 
not see the software until the end of the project (big bang software 
development). Changes to requirements, although possible, must be
limited and tightly controlled.


It is straightforward, simple to understand and use. This is both its 
strength and weakness, as experience has shown that if strictly applied, 
the model is too restrictive. Writing software is creative and requires 
iteration. 


This model is not popular with everyone, but it was the earliest lifecycle
model (LCM) and sought to bring order to chaos.

V-Shaped Model


The V-shaped model is similar to the Waterfall but emphasises the 
importance of considering the testing activities at the outset instead of 
later in the lifecycle. Each test phase is considered in its matching 
development phase, so system/functional testing is considered at the 
requirements stage, integration testing is considered at the high-level 
design stage and unit testing is considered at the detailed design stage. 
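
The pairing of development phases with test phases can be written down as a simple lookup table. The sketch below is only an illustration of the description above; the names `V_MODEL_TEST_PLANNING` and `execution_order` are hypothetical. The point that tests are executed in the reverse order to that in which they were planned is the usual reading of the V shape rather than something stated in this chapter.

```python
# Development phase -> test phase that is planned during that phase.
V_MODEL_TEST_PLANNING = {
    "requirements": "system/functional testing",
    "high-level design": "integration testing",
    "detailed design": "unit testing",
}


def execution_order():
    """Return the test phases in the order they are normally run.

    Planning follows the left arm of the V (top to bottom); execution
    follows the right arm (bottom to top), so the order is reversed.
    """
    return list(reversed(list(V_MODEL_TEST_PLANNING.values())))
```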


Prototyping Model 


Prototyping is the process of creating an incomplete model of the future 
software program prior to or during the software requirements phase. The 
client evaluates the prototype and provides feedback to the developers on 
its strengths and weaknesses. 


This feedback is used to refine or change the prototype to meet the exact
needs of the customer. Software prototyping has many variants; however,
all methods are based on the two major types of prototyping: Throwaway
and Evolutionary. Throwaway prototyping refers to the creation of a model
which will ultimately be discarded, rather than becoming part of the finally
delivered software. Its main advantage is the speed with which it can be
produced to obtain early feedback from the client. Evolutionary
prototyping involves building a very robust prototype in a structured 
manner and continually refining it. When built, it will form the heart of the 
new system, and can be used in the interim until the final system is 
delivered. 


Incremental Model 


In this model the software is constructed in incremental stages where 
each stage adds additional functionality. Each stage consists of: 


• design;
• code;
• unit test;
• integration test;
• delivery.

The incremental approach allows you to deliver functional software to the 
customer much earlier than with either the Waterfall or V-shaped models. 


The stages can be planned in such a way that you can prioritise which 
functionality to do first, i.e. you may choose to deliver the most important 
functionality to the customer first. 


It provides a tangible measure of progress but also requires careful 
planning at both the project management level and the technical level. 

Spiral Model


This is a risk-oriented software lifecycle, particularly valuable for
innovative projects where there is no track record of development. Each 
spiral addresses major risks that have been identified. After all the risks 
have been addressed, the Spiral model culminates in a waterfall software 
lifecycle. 


It is very much an iterative approach. You start small, explore the risks, 
develop a plan to deal with the risks, and then commit to an approach for 
the next iteration. 


Each iteration involves six steps: 


1. Determine objectives, alternatives and constraints.
2. Identify and resolve risks.
3. Evaluate alternatives.
4. Develop deliverables and verify that they are correct.
5. Plan the next iteration.
6. Commit to an approach for the next iteration.


Depending on the model chosen, there are different implications for the 
involvement of the programmer in the process. 


Exercise 8.2 [60 minutes] 


This can be either a classroom discussion exercise, or may be 
undertaken individually. 


a) What do all lifecycle models have in common? 
b) How do they differ? 


c) Choose two lifecycle models (other than the traditional linear model) 
and identify at what stage the following may be produced: 
Documents. 
Specifications. 
Designs, including the interface. 
Test plans and test outcomes. 
User and system documentation. 
Software. 
Test software. 
Completed software. 
Prototype software. 


d) Compare and contrast the models you have chosen with the linear 
model presented in this chapter. 

5 The Need for Documentation and Coding Standards


Good documentation depends on the availability and use of
comprehensive and proven standards. The purpose of standards is to
enforce a rigorous set of guidelines on the development, production, 
testing and documentation of a product and its subsequent monitoring 
and maintenance. 


During the various stages of a project development such as fact-finding, 
system design etc., the need arises to pass on information. Sometimes 
this is done verbally, but this is short term and prone to loss or 
misinterpretation. Documents are better, but need to conform to rules to 
avoid ambiguity, duplication, omission and contradiction. Good standards 
provide a framework, and are not intended to be rigid, constraining rules. 
Often they are provided in the form of checklists. Standards need to 
govern the way the documents are prepared and maintained. 


The readability of source code has a direct impact on how well a 
developer understands a software system. Code maintainability refers to 
how easily that software system can be changed to add new features, 
modify existing features, fix bugs, or improve performance. Although 
readability and maintainability are the result of many factors, coding 
technique is particularly important. The easiest method to ensure that a 
team of developers will yield quality code is to establish a coding 
standard, which is then enforced at routine code reviews. 


A comprehensive coding standard encompasses all aspects of code 
construction and, although developers should exercise prudence in its 
implementation, it should be closely followed. Completed source code 
should reflect a harmonised style, as if a single developer had written the 
code in one session. 


At the inception of a software project, establish a coding standard to 
ensure that all developers on the project are working in unison. When the 
software project incorporates existing source code, or when performing 
maintenance upon an existing software system, the coding standard 
should state how to deal with the existing code base. 


Although the primary purpose for conducting code reviews throughout the 
development lifecycle is to identify defects in the code, the reviews can 
also be used to enforce coding standards in a uniform manner. 


Adherence to a coding standard is only feasible when it is followed
throughout the software project from inception to completion. It is neither
practical nor prudent to impose a coding standard after the event.

Three common areas for standardisation include:


• Names, as applied to routines, variables, tables and other elements.
• Comments within the source code.
• Formatting of the code layout.


Names 


One of the most effective aids to understanding the logical flow of an 
application is how the various elements of the application are named. A 
name should tell what rather than how. For example, you could use 
GetNextStudent() instead of GetNextArrayElement(). 
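
The "what rather than how" rule can be shown with a short Python sketch. The class `ClassRegister` and its data are hypothetical, and the names follow Python's usual lower-case-with-underscores style rather than the mixed case used in the examples in this chapter.

```python
class ClassRegister:
    """Hypothetical register that happens to store students in a list."""

    def __init__(self, students):
        self._students = list(students)
        self._position = 0

    def get_next_student(self):
        """Good name: says what is returned, not how it is stored."""
        student = self._students[self._position]
        self._position += 1
        return student

    # A name such as get_next_array_element() would expose the storage
    # method and would become misleading if the list were ever replaced
    # by, say, a database query.


register = ClassRegister(["Ada", "Grace"])
first_student = register.get_next_student()
```

Code that calls `get_next_student()` does not need to change if the way students are stored changes, which is the practical benefit of the naming rule.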


Difficulty in selecting a proper name may indicate that you need to further 
analyse or define the purpose of an item. Make names long enough to be 
meaningful, but short enough to avoid being wordy. In programs, a unique 
name serves only to differentiate one item from another. Expressive 
names help the human reader; therefore, it makes sense to choose a 
name that the human reader can comprehend. However, ensure that the 
names chosen are in compliance with the rules and standards of the 
relevant language. 


The following are recommended naming techniques for routines, variables 
and miscellaneous purposes: 


Routines 

In object-oriented languages, it is redundant to include class names in the
name of class properties, such as Book.BookTitle. Instead, use
Book.Title.

Use the verb-noun method for naming routines which perform an
operation on a given object, such as CalculateInvoiceTotal().


Variables 


Append computation qualifiers (Avg, Sum, Min, Max, Index) to the end of 
a variable name, where appropriate. 


Use customary opposite pairs in variable names, such as min/max, 
begin/end, and open/close. 


Boolean variable names should contain Is, which implies Yes/No or
True/False values, such as FileIsFound.


For variable names, it is sometimes useful to include notation which
indicates the scope of the variable, such as prefixing a g_ for global
variables and m_ for module-level variables in Microsoft Visual Basic.
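
The variable-naming techniques above might look as follows in practice. This is a minimal Python sketch; the function `summarise_marks`, the pass mark of 40 and the variable names are assumptions made for illustration, written in Python's lower-case style rather than mixed case.

```python
g_exam_pass_mark = 40    # g_ prefix marks a value shared across the module


def summarise_marks(marks):
    """Return summary figures for a non-empty list of exam marks."""
    # Computation qualifiers (sum, avg, min, max) come at the end of
    # each name, and min/max form a customary opposite pair.
    mark_sum = sum(marks)
    mark_avg = mark_sum / len(marks)
    mark_min = min(marks)
    mark_max = max(marks)

    # A Boolean name containing "is" reads as a Yes/No question.
    class_is_passing = mark_avg >= g_exam_pass_mark

    return {
        "mark_sum": mark_sum,
        "mark_avg": mark_avg,
        "mark_min": mark_min,
        "mark_max": mark_max,
        "class_is_passing": class_is_passing,
    }
```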

Miscellaneous


File and folder names, like procedure names, should accurately describe 
what purpose they serve. 


Avoid reusing names for different elements, such as a routine called 
ProcessSales() and a variable called iProcessSales. 


When naming elements, avoid using commonly misspelled words. Also, 
be aware of differences which exist between American and British 
English, such as color/colour and check/cheque. 


Comments 
Software documentation exists in two forms. 


• External: external documentation is maintained outside the source code,
such as specifications, help files, and design documents.

• Internal: internal documentation is composed of comments which
developers write within the source code at development time.


One of the challenges of software documentation is ensuring that the 
comments are maintained and updated in parallel with the source code. 
Although properly commenting source code serves no purpose at run 
time, it is invaluable to a developer who must maintain a particularly 
intricate or cumbersome piece of software. 


Recommended commenting techniques include: 


• When modifying code, always keep the commenting up-to-date.

• At the beginning of every routine, it is helpful to provide standard
comments indicating the routine’s purpose, assumptions and
limitations. A boilerplate comment, one that can be reused without
change, should be a brief introduction to explain why the routine
exists and what it can do.

• Avoid adding comments at the end of a line of code; end-line
comments make code more difficult to read. However, end-line
comments are appropriate when annotating variable declarations. In
this case, align all end-line comments at a common tab stop.

• Avoid using clutter comments, such as an entire line of asterisks.
Instead, use white space to separate comments from code.

• Make sure you remove all temporary or extraneous comments before
releasing the program to avoid confusion during future maintenance
work.




• Use complete sentences when writing comments. Comments should
clarify the code, not add ambiguity.

• Comment as you code, because most likely there will be no time to do
it later. It is also a good idea to revisit code previously written, if the
opportunity arises.

• Use comments to explain the intent of the code. They should not
serve as inline translations of the code.

• Comment anything that is not readily obvious in the code.
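
Several of these techniques can be seen together in a short routine. The Python sketch below is illustrative only: the routine `calculate_invoice_total`, the business rule in its intent comment and the two constants are assumptions, not part of any real system.

```python
MAX_LINES = 50        # Most lines allowed on one printed invoice.
DEFAULT_RATE = 0.0    # Discount used when the customer has none.


def calculate_invoice_total(line_totals, discount_rate=DEFAULT_RATE):
    """Return the invoice total after discount.

    Purpose:     Totals the invoice lines and applies a single discount.
    Assumptions: All line totals are in the same currency, and
                 discount_rate is a fraction between 0 and 1.
    Limitations: Tax is not handled here.
    """
    if not 0 <= discount_rate <= 1:
        raise ValueError("discount_rate must be between 0 and 1")

    # The discount is agreed for each invoice as a whole, so it is
    # applied once to the subtotal and not to the individual lines.
    subtotal = sum(line_totals)
    return subtotal * (1 - discount_rate)
```

The header comment states purpose, assumptions and limitations; the comment inside the routine explains intent and does not simply translate the code; the only end-line comments annotate declarations and are aligned at a common column.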


Format 


Formatting emphasises the logical organisation of the code. Taking the 
time to ensure that the source code is formatted in a consistent, logical 
manner is both helpful to you and to other developers who must decipher 
the source code. 


Here are some formatting techniques: 
• Establish a standard size for an indent, such as four spaces, and use
it consistently. Align sections of code using the prescribed indentation.

• Use a monospace font when publishing hard-copy versions of the
source code.

• Except for constants, which are best expressed in all uppercase
characters with underscores, use mixed case instead of underscores
to make names easier to read.

• Indent code along the lines of logical construction. Without indenting,
code becomes difficult to follow, such as:


If ... Then 
If ... Then 


Else 
End If 
Else 


End If 


Indenting the code yields easier-to-read code, such as: 


If ... Then
    If ... Then

    Else
    End If
Else

End If

• Establish a maximum line length for comments and code to avoid
having to scroll the source code window and to allow for clean
hard-copy presentation.

• Use space to provide organisational clues to source code. Doing so
creates ‘paragraphs’ of code, which aid the reader in comprehending
the logical segmenting of the software.

• Break down large, complex sections of code into smaller,
comprehensible modules.
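
The sketch below applies some of these formatting techniques to a small Python checker that itself tests two of the rules. It is an illustration under stated assumptions: the limits of 79 characters and four spaces, and all function names, are hypothetical choices. Python's own style guide prefers lower-case names with underscores, so the mixed-case recommendation above is not followed for routine names here.

```python
MAX_LINE_LENGTH = 79    # Constant: upper case with underscores.
INDENT_SIZE = 4         # Standard indent, in spaces.


def is_line_too_long(line):
    """True if the line exceeds the agreed maximum length."""
    return len(line.rstrip("\n")) > MAX_LINE_LENGTH


def is_indent_consistent(line):
    """True if leading spaces are a whole number of standard indents."""
    indent = len(line) - len(line.lstrip(" "))
    return indent % INDENT_SIZE == 0


def check_source(lines):
    """Return (line_number, problem) pairs for lines that break a rule."""
    problems = []

    for number, line in enumerate(lines, start=1):
        if is_line_too_long(line):
            problems.append((number, "line too long"))
        elif not is_indent_consistent(line):
            problems.append((number, "irregular indent"))

    return problems
```

The work is split into three small routines, blank lines separate the "paragraphs" inside `check_source`, and the indentation follows the logical construction of the loop and its conditions.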


Study Note
It is worth undertaking extra research to see what other coding
standards exist, in addition to those mentioned here.


Using solid coding techniques and good programming practices to create 
high quality code plays an important role in software quality and 
performance. In addition, by consistently applying a well-defined coding 
standard and proper coding techniques, as well as holding routine code 
reviews, a team of programmers working on a software project is more 
likely to yield a software system that is easier to comprehend and 
maintain. 


Exercise 8.3 [60 minutes] 
You are responsible for setting coding standards for your development 
team. Write a document stipulating the standards you wish to impose, 
paying particular attention to: 

• names
• comments
• formats


Be positive in your stipulations, e.g. “Indentations will be...”.


6 The Attributes of Good Documentation


Every phase of the lifecycle generates documentation. Specifications, 
designs, business rules, inspection reports, configurations, code changes, 
test plans, test cases, bug reports, user manuals, etc. should all be 
documented. Documentation can be electronic (e.g. online documents for
networked sharing and access, built-in help systems) and visual (e.g.
sketches and diagrams, as well as text).


Good documentation requires time and effort and an appreciation of who 
the audience will be as this will dictate the style and tone of the work. Do 
not create unnecessary documentation; be selective and consider if the
output is really needed. 


The primary attributes of good documentation are that it must be:


• complete and up-to-date;
• well structured, with neither too much information nor too little;
• indexed;
• presented in a standard form (which needs to be defined).


It is annoying for a programmer to study, and make changes to, someone 
else’s work on the basis of the information given in the documentation and 
then to find that some prior changes were not documented, thus 
invalidating the latest amendments. 


As already stated, good documentation should be well structured, with 
neither too much information nor too little. To achieve this, the writer 
should be able to construct clear and concise technical prose. If too much 
information is provided for users, it will confuse rather than help. It is 
important to write taking the audience into account, e.g. end users need
different information from technical support staff, and like it presented in a
different style.


The number and the size of computer programs produced is continually 
increasing. For this reason, it is necessary to index documents to allow 
users to find the information that they require. The information should be 
presented in a standard format so that it is easier to locate, e.g. if there 
are different styles of information presentation, users will have to become 
familiar with each of them before being able to find what they want. 


Exercise 8.4 [30 minutes] 


Write a list of objectives that authors of systems documentation should 
aim to achieve. 


7 The Elements of Documentation 


All large software systems, irrespective of their applications, have a 
prodigious amount of documentation associated with them. This 
documentation can be classed as either: 


• user documentation, or
• system documentation.

User documentation describes the functions of the system, without
reference to how they are implemented. System documentation includes 
all aspects of the system design, implementation and testing. 


End user documentation (manuals and guides) is written by technical 
specialists with input from the systems analyst and the programmer. They 
will produce three types of document, one for each person who is going to 
use the system after its completion: 


• The operator.
• The user.
• The maintainer.


Users, unless they are going to set up the system themselves, do not 
need to know detailed set-up procedures, but operators do. Maintainers 
need detailed design knowledge, while users do not. 


User Documentation 


User documentation can begin with the User Request and finish with end 
user guides to the system. 


The User Request is the starting point for every software project and it is 
crucial to get it right from the beginning. Principally, it is a written and 
approved statement of the nature and objectives of the project. In general 
it will: 


• State the Problem: this is the first formal approach to initiate a project.
The users/clients (with the assistance of a systems analyst in some
cases) will provide all the required information. For example, details of
the company, definition of the problem and available studies or
back-up material of known or potential value to the systems analyst.

• Assess the Feasibility: this is a description of a proposed approach to
the project, which states the objectives and parameters of the project.
The users will then be able to agree to and approve the proposed
changes.

• Plan and Schedule for Implementation: this is a timetable which
contains instructions and priorities for subsequent development work.
It will also state an approximate date for completion of the project.


In general, these specifications are written by neither programmers nor 
systems analysts, but by the specialists in the various fields (banking, 
insurance, engineering) who can understand and talk to the clients in their 
own jargon, so that they clearly define the problems and their solutions. 


User manuals should provide fast access to accurate information, as very 
few users are willing to read a manual from cover to cover. Depending on 
the size of the system, this document may be provided as two separate
manuals or bound together. 


The first part of the guide or manual is for novice users and it describes 
how to get started on the system and how the user might make use of the 
common system facilities. It should include examples that the user can 
easily follow step by step, and information on how to recover from the kind 
of mistakes that beginners inevitably make. 


The second part of the guide or manual should be more technical and 
intended for experienced users. This is the definitive document on system 
usage and therefore should be complete, e.g. it should describe error 
reports generated by the system and have a comprehensive index. The 
manual itself is produced as a resource for those wishing to use specific 
commands or options in the system. 


If the system is at all complex, there will probably be a detailed step by 
step account of the use of the system, with a set of worked examples, and 
some examples of finished products to assist the user in the initial steep 
learning curve. The best application systems also have online help, and 
perhaps a work-through tutorial. 


System Documentation 


Development of the systems design documentation is usually performed 
by the systems analyst who interfaces between the user and the 
programmer. It requires the production of sub-set specifications, namely: 


• Overall Systems Specifications: this is a general description of the
complete system giving a non-technical view of the proposed system.
It also shows how the requirements are decomposed into a set of
interacting programs. This is not required when the system is
implemented using only a single program.

• Input/Output Specifications: this describes all the inputs and outputs
of the system. This includes the format, the values (min, max), the
purpose, the frequency and the volume of information coming in and
out of the program.

• Program Specification: this details the hierarchical structure of the
programs, the data flow between programs and program interfaces. It
also includes information about internal program design, such as
choice of algorithms and flowcharts, or pseudo-code.

• Table of Contents: all the modules should be listed with a brief
description of their functions. This should serve as the index.
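
To make the Input/Output Specification more concrete, the items it records for each input or output could be captured in a small structure. The following Python sketch is one possible representation; the class `FieldSpec`, its field names and the `exam_mark` example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """One input or output item, as it might appear in an I/O specification."""
    name: str
    purpose: str
    data_format: str      # e.g. "integer" or "DD/MM/YYYY"
    min_value: float
    max_value: float
    frequency: str        # how often the data arrives or is produced
    volume: str           # how much data is involved each time

    def accepts(self, value):
        """True if the value lies within the specified range."""
        return self.min_value <= value <= self.max_value


# Hypothetical entry used only to show the structure in use.
exam_mark = FieldSpec(
    name="exam_mark",
    purpose="Mark awarded for one exam paper",
    data_format="integer",
    min_value=0,
    max_value=100,
    frequency="once per exam sitting",
    volume="one value per student",
)
```

Recording the minimum and maximum values in this way lets the same specification drive input validation in the program, so the documentation and the code are less likely to drift apart.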


Module documentation is the lowest level of documentation outside the 
source code itself and it will be the most useful when the program needs 
to be maintained or improved at a later stage. The amount of 
documentation will depend mainly on whether or not it is an internal 
module, requiring only a few lines of code, or a module implemented as
external procedures, in which case more documentation is required as its 
function is more complex. The description of each module should be 
functional and contain only the minimum necessary information. It should 
include the following: 


• Name of the module (which should preferably be related to its
function).

• Function of the module.

• List of routines called.

• Input data with their type, format and range of values: this consists of
an external data dictionary.

• Local variables used: this is called an internal data dictionary and
includes flags, calculation results, tables and other temporary
variables.

• Description of algorithm, if one has been used and it is not a
well-known one.

• Identification of internal functions/procedures with any access to
global variables listed.

• Data and program flowcharts.

• Error handling, which is a list of the types of error detected and the
action taken.

• The tests performed on the module and their results.
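
One way to keep such a module description close to the code is a standard header at the top of the module. The Python sketch below shows a possible template covering most of the items above; the module name `invoice_totals`, the routine and the wording of each entry are assumptions made for illustration.

```python
"""Module documentation (illustrative template).

Name:            invoice_totals
Function:        Calculates the total value of an invoice.
Routines called: sum (built-in)
Input data:      line_totals - list of numbers, each 0 or greater
Local variables: total - number, result of the calculation
Algorithm:       Simple summation (well known, so not described further)
Global access:   None
Error handling:  ValueError is raised if any line total is negative
Tests performed: Normal case and negative-value case, both passed
"""


def invoice_total(line_totals):
    """Return the sum of the line totals, rejecting negative values."""
    if any(amount < 0 for amount in line_totals):
        raise ValueError("line totals must not be negative")

    total = sum(line_totals)
    return total
```

Flowcharts cannot be held in a text header, so they would normally be kept as separate documents and referred to by name.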


Source code documentation is internal and should be kept to a minimum 
and written in such a way that it does not break the flow of the code, but 
above all it should be accurate. 


The operator’s manual is intended for technicians and explains how to 
install the system and tailor it to particular hardware configurations (e.g. it 
should describe the minimal hardware configuration required to run the 
system and any permanent files that need to be created). However, most 
software applications nowadays have a set-up file which sets up the 
system and configures the computer automatically. 


The system maintenance manual is needed for a thorough understanding 
of the data organisation, program functions, error messages, and 
recommended recovery actions. If necessary, it may also give detailed
information about hardware intervention (e.g. changing a printer
cartridge).

8 Techniques of Documentation


It is important to communicate as clearly as possible, and various 
techniques may be used. As well as text descriptions, visual 
representation of concepts and designs works well. For example, for 
object-oriented systems there is the range of diagramming techniques 
covered by the Unified Modelling Language (See Chapter 5). 


In structured programming, flowcharts can be used to document system 
design and to indicate the organisation of a program (See Chapter 2). 
They help to make a program/system structure immediately apparent to 
the reader. 


Program or procedure names are written in rectangles, with the named 
inputs, outputs and backing storage elements being shown for each. The 
direction of data flow is shown by means of arrows. Each file is drawn 
only once and arrows are used to link it to those programs which use it 
either for input or output. Decisions are written in diamond-shaped boxes 
and are used to indicate conditional changes. 


Other chart forms used in structured programs include Hierarchy-Input- 
Process-Output (HIPO) and Warnier-Orr charts. HIPO charts are similar 
to Warnier-Orr in that they show structure but they do not indicate module 
interfaces or any procedural details. HIPO charts are produced at the 
stage of the system design process when the analyst is ready to start on 
the data design and program design. They are used to identify the major 
functions of each program and the major elements of the data without 
implying any particular data organisation, program/subprogram hierarchy 
or choice of algorithm. 


Warnier-Orr diagrams are a good visual representation of the data 
structuring and refinement process (e.g. the activity structure for a given 
system). It is also possible to code directly from a Warnier-Orr diagram, 
depending on the level of detail. 


Study Note
More detailed information about HIPO and Warnier-Orr diagrams can
be found in a number of standard computing reference books, or by
searching the World Wide Web. A short tutorial on Warnier-Orr
diagrams can be found at http://www.kenorr.com/articles.html.


In addition to paper based documentation there is increasing use of digital 
forms of documentation. These include those forms which remain close in 
style and design to the paper equivalent, but can be read on screen e.g. 
text based web pages, documents produced in the Portable Document 
Format (PDF), which are readable via the freely available Adobe Acrobat 
Reader software, or online help systems commonly incorporated into
application software programs and modern operating systems. 


Somewhat different are the multimedia programs now commonly available 
as training aids for users learning off-the-shelf software programs. These 
use a mix of text, animation, sound and video to convey knowledge of the 
program and are particularly good at explaining procedures, or how 
software works. Multimedia programs communicate clearly, but because 
of the mix of media are expensive to produce, especially for a one-off 
system, and they can also be data intensive (video in particular results in 
very large files). 


Screen grab tools, which let you capture screen images, can enhance both
paper and digital documentation by providing pictures to support the text
or other media of explanation (see the screen grabs featured in Chapter
7). A variation on the screen grab (of single images) is the screen cam 
(short for camera) which can capture in sequence a range of screen 
changes including cursor movements, and then run that sequence of 
changes as if a movie. These short animations are quite easy to create 
but are also data intensive, so they work best on local machines rather 
than across networks where data transfer rates are a problem. 


Exercise 8.5 [60 minutes] 


It is likely that any commercial software program or the operating 
system you use has a help system associated with it. Spend some time
using the system to understand how it works then carry out a critical 
review. 
Consider these points: 
How easy is it to find what you want to know? 
How easy is it to understand the information provided? 
In which form is help provided? 
- words? 
- pictures? 
- both? 
Which do you find better? 


Do you consider the help system a good technique of
documentation?


What are the benefits? 
What are the disadvantages? 


Do you prefer paper-based documentation? If so why? 




Chapter 8 — Implementation Programming Methods 


9 The Programmer’s Role 


Now that we have had an overview of the system development process it 
is easier to see where the programmer fits in. 


In the main, the programmer is introduced at the coding, testing, 
implementation and support stages. Generally, after the design stage, and 
upon completion of the detailed program specifications, the systems 
analyst will call upon the programming team to develop the programs that 
will fulfil the functional requirements of the system. 


However, it should be noted that if a prototyping or highly iterative 
approach is taken, the programmer may feature earlier in the process, as 
demonstrator coded interfaces may be used to determine customer 
requirements. 


It is also the case that in smaller organisations the role of analyst and 
programmer may be integrated, as it may not be economically justifiable 
to have these roles filled by separate people. In this case the programmer 
can expect to be involved in the specification and design stages as well 
as coding, testing and implementation. 


The role of the programmer is also changing regarding the software tools 
he/she is expected to know and use. There is much greater use now of 
higher-level tools, software reuse and desktop applications for localised 
information systems. Increasingly, tools have visual techniques for 
outlining coding requirements. There is increasing emphasis on assembly 
of software parts, rather than programming from scratch. Tools for design 
and analysis are being integrated with tools for code generation. 


The software industry continues to evolve, which inevitably means 
changes in the role of the programmer. 


Exercise 8.6 [30 minutes] 


This is a classroom discussion exercise, although you may wish to 
write down your own views anyway. 


After reading this chapter, and taking into account all that is covered in
the rest of the workbook, what do you think is the programmer’s role
today? Has it changed? Do you think it will change in the future? If so,
why? Support any statements you make or views you hold with
evidence from this workbook or the readings and research you have
undertaken.
10 Summary


In this chapter we have covered: 
• The traditional systems lifecycle and the various stages associated
with it.

• A variety of Software Development Lifecycle models and an overview
of the increasing sophistication in model development.

• The need for documentation and coding standards.

• The attributes of good documentation.

• The elements of documentation.

• Techniques of documentation.

• A review of the programmer’s role.


The purpose of this chapter was to put programming into a working 
context and to stress the importance of working to sound procedures and 
standards, because ultimately, programming requires a rigorous
approach, and is not forgiving of error. The risk of error is considerably 
reduced by working to standards and error detection is made more likely 
by good documentation. 


11 Self Study 


These exercises and self-study recommendations are designed to help 
you learn about implementation. The exercises consist of: 

• Recommended reading.

• Internet research on key topics.

• Activities.

• Review questions.


Use them as follows: 

Work through the questions and jot down your initial answers. 

All the answers are contained in the chapter text. Go back and review the
text to check the accuracy of your answers. Where an answer is incorrect
or incomplete, enter the correct answer against the question and
use this for revision or for retesting at a later date.
Self Study 1


What are lifecycle models? 
How do they help in the software process? 
What are the stages of a traditional systems lifecycle? 


What is involved in requirements analysis and what does it 
achieve? 


What is the purpose of the systems analysis? 

How easy is it to specify systems requirements? 

In design, how are complex problems best solved? 
What is meant by good code? 

What is tested at the coding stage? 

When do users contribute to the testing process? 


What factors are to be considered in tracking down a technical 
problem? 


Self Study 2 
What is the difference between system development lifecycle and
software development lifecycle? 


Name other lifecycle models. 


Why is the Waterfall model regarded by some as outmoded? 


When is the programmer involved in these different models and at 
what stage? 


How does the Unified Process fit into these models? 


When might you use one rather than another? 
Self Study 3
Who benefits from standardised documentation? 
What does good documentation depend on? 
What is meant by internal documentation? 
What is one of the big challenges of software documentation? 
When is it best to add comments? 
What does formatting do? 
What are the attributes of good documentation? 


Define the elements of system documentation. 


What does writing system documentation generally involve? 


Name the different specification documents which form part of the 
system design documentation. 


Self Study 4 
• What diagramming techniques are available?

• What do you see as the programmer’s role?


Go back and select some diagramming techniques from Chapter 5. Think 
of how you might include them into system documentation. 
History of Computing — Key Events


Date: Event


Pre-computers: The abacus invented, known to the Romans, Greeks and
Babylonians.

Hindu/Arabic system of numbering invented, which used numbers rather
than letters. The Hindus had first used zero around the 9th century BC,
and by the 7th century AD had developed a decimal system. This was
adopted by the Arab world, which went on to develop it and also to
introduce the system into Europe. The Chinese by this time had been
using negative numbers, and using ‘powers of ten’ to express magnitude.


8-9th century: The algorithm developed by Muhammad ibn Musa
Al’Khowarizmi. The algorithm is a series of steps that can be followed in
order to solve a problem. A talented mathematician, Muhammad ibn Musa
Al’Khowarizmi was also responsible for the introduction of the zero into
the Hindu/Arabic system of numbering and wrote a book on algebra. The
word algorithm comes from a corruption of his name.


17th century: Napier invents the logarithm and several calculating
machines. The calculating machines included Napier’s bones and the
chessboard calculator. Napier was also responsible for popularising the
decimal point, which had been invented in Holland.


1614: John Napier invented logarithms.


1615: The slide rule is invented by William Oughtred. The slide rule was
based on the calculations made by Napier and this became the principal
means of calculation until the early 20th century.


1623: Wilhelm Schickard (1592-1635) invented the mechanical calculating
machine that can simplify the multiplication of long numbers. Schickard’s
machine used Napier’s bones as its basis.


1645: Blaise Pascal develops an adding machine.


1672-74: Gottfried Leibniz built his first calculator, the Stepped
Reckoner.


1801: Jacquard looms and punch cards. Joseph Marie Jacquard invented a
system that used cards with holes punched into them to automate the
production of designed carpets. Previously, a weaver had been
responsible for the planning of the pattern, and setting up of the loom.
The Jacquard loom relied on a card, which once programmed could
control the production of the pattern and repeat it for as long as the
carpet required.


1820: The first mass-produced calculator, the Arithmometer, was developed
by Charles Thomas de Colmar (1785-1870).


1822: Charles Babbage completed his first model for the difference engine,
which could produce complex mathematical calculations.
1830s


1890 
1936 
1938 


1939 


1940s 
1943 


1945 


1946 


1948 


1950s 


1951 


1952 


1953 
1957 
1958 


Late 1950s 
Babbage created the first design for the analytical engine. The analytical
engine, though it remained unfinished until recent times, is often quoted 
as being the first computer. 


Augusta Ada, Countess of Lovelace and daughter of Lord Byron, developed
punch cards for the Analytical Engine and invented subroutines.


Herman Hollerith developed the punched card system for the US census. 
Alan Turing published the mathematical theory of computing. 


Konrad Zuse constructed the first binary calculator, using Boolean 
algebra. 


US mathematician and physicist J V Atanasoff (1903-1995) became the 
first to use electronic means for mechanising arithmetical operations. 


Machine code, which uses 1s and 0s (binary).


The Colossus electronic code-breaker was developed at Bletchley Park, 
England. 


Konrad Zuse developed a language called Plankalkül that was never
implemented. 


The Harvard University Mark I or Automatic Sequence Controlled
Calculator (partly financed by IBM) became the first program-controlled
calculator. ENIAC (acronym for Electronic Numerical Integrator and
Computer), the first general purpose, fully electronic digital
computer, was completed at the University of Pennsylvania, USA.


Manchester University (England) Mark I, the first stored-program
computer, was completed. 


Assembly language. 

Structured programming. 

Dijkstra and his GO TO statements. 

Symbolic notation. 

Expression compilers. 

William Shockley of Bell Laboratories invented the transistor. 
Launch of Ferranti Mark I, the first commercially produced computer. 


Whirlwind, the first real-time computer, was built for the US air-defence 
system. Grace Murray Hopper of Remington Rand invented the compiler 
computer program. 


EDVAC (acronym for Electronic Discrete VAriable Computer) was 
completed at the Institute for Advanced Study, Princeton, USA (by John 
Von Neumann and others). 


Magnetic core memory was developed. 

FORTRAN introduced by John Backus 

The first integrated circuit was constructed. 

LISP by John McCarthy at MIT (Massachusetts Institute of Technology). 
ALGOL. 

COBOL by Grace Hopper. 
1963


1964 


1965 
Late 1960s 


Early 1970s 
1971 


1972 
1974 


1975 


Late 70s 
1980s 


1981 


1984 


1985 


1986 
1988 


1989 
The first minicomputer was built by Digital Equipment (DEC). The first
electronic calculator was built by Bell Punch Company. 


Launch of IBM System/360, the first compatible family of computers. 


John Kemeny and Thomas Kurtz of Dartmouth College invented BASIC 
(Beginner’s All-purpose Symbolic Instruction Code), a computer 
language similar to FORTRAN. 


The first supercomputer, the Control Data CDC 6600, was developed.
Better performance from languages. 

Still necessary to book time on mainframes and minicomputers. 
More languages appear and others are improved. 

PROLOG. 

The first microprocessor, the Intel 4004, was announced. 

Pascal designed as a teaching tool. 

C developed to support portable operating system design. 


CLIP-4, the first computer with a parallel architecture, was developed by 
John Backus at IBM. 


Altair 8800, the first personal computer (PC), or microcomputer, was 
launched. 


Smalltalk designed as a beginner’s language and environment. 
First personal computers appear. 

Object-oriented programming. 

Client Server concept. 

Turbo Pascal. 

Data abstractions. 

ADA. 


The Xerox Star system, the first WIMP system (acronym for Windows, 
Icons, Menus, and Pointing devices), was developed. IBM launched the 
IBM PC. 


Apple launched the Macintosh computer. 


The Inmos T414 transputer, the first ‘off-the-shelf’ microprocessor for
building parallel computers, was announced. 
C++ 


The first optical microprocessor, which uses light instead of electricity, 
was developed. 


Wafer-scale silicon memory chips, able to store 200 million characters, 
were launched. 


World Wide Web, invented by Tim Berners-Lee who wanted to use 
hypertext to make documents and information seamlessly 
accessible over different kinds of computers and systems, and 
wherever they might be in the world. 
1990s


1990 


1991 
1992 


1993 


1995 


1996 


1997 


1998 
Networks.
TCL, PERL, Java in the mid 90s. 


Microsoft released Windows 3, a popular windowing environment for 
PCs. 


Visual Basic. 


Philips launched the CD-I (Compact-Disc Interactive) player, based on 
CD audio technology, to provide interactive multimedia programs for the 
home user. 


Intel launched the Pentium chip containing 3.1 million transistors and 
capable of 100 mips (millions of instructions per second). The Personal 
Digital Assistant (PDA), which recognises user’s handwriting, went on 
sale. 


Intel launched the Pentium Pro microprocessor (formerly codenamed 
P6). 


Java is released. 


Netscape Navigator 2.0 released. First browser to support JavaScript. 
Intel released the 150 & 166 MHz versions of the Pentium Processor. 


Linux 2.0 released. 2.0 was a significant improvement over the earlier 
versions: it was the first to support multiple architectures. 


Hotmail, founded by Sabeer Bhatia and Jack Smith, is commercially 
launched on Independence Day in the United States. 


Intel released the 200 MHz version of the Pentium Processor. 


Tim Berners-Lee awarded the Institute of Physics' 1997 Duddell Medal 
for inventing the World Wide Web 


IBM’s computer Deep Blue beat grand master Garry Kasparov at chess,
the first time a computer has beaten a human grand master. Deep Blue is 
reputed to have crashed at least twice during the match but this was not 
counted as a lost game! 


Intel releases the 233 MHz Pentium MMX. 


After 18 months of losses Apple Computer was in serious financial 
trouble. Microsoft invested in Apple, buying 100,000 non-voting shares 
worth $150 million. One of the conditions was that Apple was to drop the 
long running court case - attempting to sue Microsoft for copying the look 
and feel of their operating system when designing Windows. 


Intel released the 333 MHz Pentium II processor. 


Plans are announced in the USA for Internet2, a high-speed data 
communications backbone that will run on a second network Abilene. 
Serving the main US research universities, it will enable them to bypass 
congestion on the Internet and should be operational by 1999. 


A US court finally banned the long-running practice of cybersquatting or 
buying domain names relating to trademarks and then selling them for 
extortionate prices to the companies who own the trademark. 


Apple announced the iMac, an all-in-one with integral 15 inch (381 mm) 
multiscan monitor, 24x CDROM, 2x available USB ports, 56 kbit/s 
modem, 2 stereo speakers, and Ethernet, but no floppy drive. 


Microsoft released Windows 98. 
Apple releases the PowerMac G4.


Official Launch of Windows 2000 - Microsoft's replacement for Windows 
95/98 and Windows NT. 


Sony releases the PlayStation 2. 

Intel releases the Pentium IV. 

Apple released Mac OS X. 

Apple Computer released the now famous iPod. 


Microsoft released Windows XP, based on Windows 2000 and Windows 
NT kernel. 


Microsoft released Xbox, a game console with a flagship title Halo. 
United Linux officially formed. 


Mozilla Firefox 1.0 released, Microsoft Internet Explorer's biggest 
competitor since Netscape Navigator. 


nVidia releases GeForce 6800, claiming it is the biggest leap in graphics 
technology the company ever made. 


Jef Raskin, who in 1979 envisioned and established the Apple Macintosh 
project at Apple Computer, dies at the age of 61. 


Apple Computer releases Mac OS X v10.4 for the Apple Macintosh. 


Microsoft debuts the Xbox 360, their second-generation console with 
wireless controllers, integrated online gaming, surround-sound and high- 
definition graphics. 


Apple Computer introduces the MacBook Pro, their first Intel-based, dual- 
core mobile computer, as well as an Intel-based iMac. 


Microsoft Corporation launches Windows Vista more than 5 years after 
their last major, new operating system, Windows XP, was released 
Glossary


4GL Fourth generation languages
A term used in the data processing community for a high-level
language that is designed to allow users who are not trained
programmers to develop applications, in particular for
querying databases and generating reports. 4GLs are usually
nonprocedural languages in which the user describes what is
wanted in terms of the application, not the computer. The
processor takes the user’s description and either interprets it
directly or generates a program (in a database query
language or COBOL) that will perform the desired operation.
For this reason the latter are sometimes called application
generators.


Abstract data type
A data type that is defined solely in terms of the operations
that apply to objects of the type without commitment as to how
the value of such an object is to be represented.


Abstract class
An abstract class may have one or more subclasses, but
never an instance. In other words, an abstract class may be
inherited by another class, but cannot itself be instantiated.
Reptile is an abstract class: its data and
responsibilities can be inherited, but a reptile is not an animal
in its own right. See concrete class.


Abstraction
Abstraction enables the developer to re-use a class, and filter
out operations and attributes from that class that are
superfluous to needs. An object may have a long list of
associated features that are not always relevant to the system
that the developer wishes to create. See Information hiding.


Access
An inexpensive single-user database management system
developed by Microsoft Inc.


Actual parameter
Information passed to a subprogram at the call. See also
parameter, argument.


Algorithms
The method of solving the problem in a form that can easily be
understood and then translated into the actual programming
code required.


Applications program generator
A program, or software tool, that is capable of creating a
range of application programs in a particular domain. The
generated program will be configured by information provided
by the person using the application generator. Domains in
which application generators are frequently encountered
include simulation, process control and user interface
software. See also fourth generation language. 
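
The abstract class definition above can be illustrated with a short sketch. Python is used here only as a convenient notation; the class names Reptile and Lizard follow the glossary's own example, while the move method and the helper function are hypothetical names introduced for illustration.

```python
from abc import ABC, abstractmethod


class Reptile(ABC):
    """Abstract class: it can be inherited from, but never instantiated."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def move(self):
        """Each concrete subclass must say how it moves."""


class Lizard(Reptile):
    """Concrete class: it inherits from Reptile and may have instances."""

    def move(self):
        return f"{self.name} scurries"


def reptile_cannot_be_created():
    """Return True if the abstract class refuses to be instantiated."""
    try:
        Reptile("generic reptile")
    except TypeError:
        return True
    return False
```

Calling Lizard("Iggy").move() works because Lizard is concrete, whereas any attempt to create a plain Reptile is rejected.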


Argument
A value or address passed to a procedure or function at the
time of call. Thus in the BASIC statement Y=SQR(X), X is
the argument of the SQR (square root) function. Arguments
are sometimes referred to as actual parameters.


Array
An ordered collection of a number of elements of the same
type, the number being fixed unless the array is flexible. The
element of one array may be of type integer, those of another
array may be of type real, while the elements of a third array
may be of type character string (if the programming language
recognises compound types).

Each element has a unique list of index values that determine
its position in the ordered collection. Each index is of a
discrete type. The number of dimensions in the ordering is
fixed.

A one-dimensional array, or vector, consists of a list of
elements distinguished by a single index. If v is a
one-dimensional array and i is the index value, then v_i refers to the
ith element of v. If the index ranges from L through U then the
value L is called the lower bound of v and U is the upper
bound. Usually in mathematics and often in mathematical
computing the index type is taken as integer and the lower
bound is taken as one.


Artificial Intelligence
A term used to describe the discipline which aims to make
computer systems behave more like people.


Assignment statement
A fundamental statement of all programming languages
(except declarative languages) that assigns a new value to a
variable. The typical form in Algol-like languages is
variable:=expression where := is read as “becomes”; the
symbol suggests a left-pointing arrow to signify the
conveyance of a value to the variable on the left. Other
languages (particularly BASIC, C and Fortran) use = as the
assignment operator, e.g. a=b+c. This leads to problems in
expressing the concept of equality. BASIC, being an
unsophisticated language, is able to use = for both purposes;
C uses == for equality and Fortran uses .EQ.


Association
Objects are often associated with each other, in the same way
that people might be associated with each other in daily life.
Associations represent relationships between different, not
similar, object classes (thus a person works in a company; a
company has a number of offices). These object relationships
can either be one-way or they can be two-way. Objects can
also have relationships that are one-to-one or
one-to-many, or alternatively an object can have different
relationships with the same object. These associations are
known as multiplicity. 
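
The array and assignment statement definitions above can be made concrete with a small sketch. It assumes Python as the notation; the names v, L, U and element are illustrative only. Python lists always start at index 0, so the lower bound of one is applied by hand.

```python
# A one-dimensional array (vector) v with lower bound L and upper bound U.
L, U = 1, 5
v = [10, 20, 30, 40, 50]          # holds the elements v_L .. v_U


def element(i):
    """Return the i-th element of v, where i runs from L through U."""
    if not L <= i <= U:
        raise IndexError(f"index {i} is outside {L}..{U}")
    return v[i - L]


# Assignment: = conveys a new value to the variable on the left.
b, c = 2, 3
a = b + c

# Equality: == compares two values and yields True or False.
is_five = (a == 5)
```

Like C, Python keeps the two ideas apart by writing = for assignment and == for equality.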


A defined property of an entity, object etc. 


Software objects have state and behaviours; these form the 
structure of the object. The object’s state is made up of items 
of data called attributes (or properties) which describe 
aspects of the object, whilst its behaviours are the operations 
that the object carries out. 


The object's behaviours are the things that it knows how to
do. For example, where the object is a car, the behaviours
would include accelerating, braking, changing gear and
turning the windscreen wipers on.


A numbering system to the base two, thus using only the
digits 0 and 1 (bits), which is the basis of computer logic.


This is not based on any knowledge of internal design or code.
Tests are based on requirements and functionality. See white 
box testing. 


The mathematics and logic applicable to binary situations, e.g. 
on/off, yes/no, TRUE/FALSE. 


A program used to display documents which reside on the 
World Wide Web. The most common ones are Microsoft
Internet Explorer and Netscape Navigator/Communicator. 


A form of sorting by exchanging that simply interchanges pairs 
of elements that are out of order in a sequence of passes 
through the file, until no such pairs exist. This method is not 
competitive with straight insertion. 
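
The exchange sort described above can be sketched as follows. This is a minimal illustration in Python rather than a reference implementation; the function name bubble_sort is an assumption made for the example.

```python
def bubble_sort(items):
    """Return a sorted copy of items using exchange (bubble) sorting."""
    data = list(items)
    swapped = True
    while swapped:                      # keep making passes through the data
        swapped = False
        for i in range(len(data) - 1):
            if data[i] > data[i + 1]:   # an adjacent pair is out of order
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return data                         # no out-of-order pairs remain
```

The loop stops after the first pass in which no pair needed to be exchanged, which matches the "until no such pairs exist" condition in the definition.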


An error in a program. (To debug means to remove errors from the
program.)


An object-oriented version of the C programming language. 


To transfer control to a subroutine or procedure, with provision 
for return to the instruction following the call at the end of the 
execution of the subroutine / procedure. 


A conditional control structure that appears in most modern
programming languages and allows a selection to be made 
between several sets of program statements; the choice is 
dependent on the value of some expression. The CASE 
statement is a more general structure than the IF THEN ELSE 
statement, which allows a choice between only two sets of 
statements. 
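
A multi-way selection of this kind can be sketched in Python. The example assumes an if/elif/else chain so that it runs on older interpreters (Python gained a comparable match statement in version 3.10); the function name day_type and the day abbreviations are hypothetical.

```python
def day_type(day):
    """Choose one of several branches from the value of one expression."""
    if day in ("Sat", "Sun"):
        return "weekend"
    elif day in ("Mon", "Tue", "Wed", "Thu", "Fri"):
        return "weekday"
    else:                      # the default branch, as in CASE ... ELSE
        return "unknown"
```

An IF THEN ELSE statement alone would offer only two of these branches; the chain offers three, and more could be added.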


An object is an instance, or example, of a category or class.
The house system is a class or category, and the types of 
housing e.g. apartment, bungalow are instances of that class. 
In the real world, we are instances of the people class, and we 
may also be instances of other classes such as the brother 
class, mother class, and student class. 


A class diagram describes the types of objects in the system 
and the various kinds of static relationships that exist among 
them. 


Objects can be grouped together that share similarities and 
properties. Classification looks at the shared attributes and 
behaviours to create or describe classes of objects. 


Used to describe two computers — one of which manages the 
data (server) and provides the data for the client computer 
which has software to manipulate the data. For example, with 
databases where the server computer provides the data and 
the client — the user’s computer — uses presentation software 
to present the data in graphical format. 


A concrete class is a class that can have one or more
subclasses and/or instances. Subclasses of the reptile class
such as lizard and crocodile are concrete classes. They may
have subclasses such as iguana and may occur a number of times,
i.e. have many instances.


A program that translates high-level language into absolute 
code, or sometimes into assembly language. The input to the 
compiler (the source code) is a description of an algorithm or 
program in a problem-oriented language; its output (the object 
code) is an equivalent description of the algorithm in a 
machine-oriented language. 


A quantity or data item whose value does not change. 
A value that is determined by its denotation, i.e. a literal. 


A syntactic form in a language to express flow of control. 
Common control structures are: 


IF ... THEN ... ELSE, WHILE ... DO, 
REPEAT ... UNTIL, CASE 


A generic term referring to a class of languages used for 
defining and accessing databases. A particular database 
language will be associated with a particular database 
management system. There are two distinct classes of 
database language: those that do not provide complete 
programming facilities and are designed to be used in 
association with some general-purpose programming
language (the host language), and those that do provide
computer programming facilities (database programming
languages). Some products adopting the former approach
seek to minimise host-language programming by the provision
of fourth generation language (4GL) facilities.


A database language must provide for both logical schema
specification and modification (data description) and for
retrieval and update (data manipulation). In some cases,
particularly products derived from the CODASYL network
database standard, these aspects are treated distinctly as the
data description language (DDL) and the data manipulation
language (DML). Modification to the storage schema is also
generally separately provided.


Database Management

A DBMS is a comprehensive software tool that allows users to 
create, maintain, and manipulate an integrated base of 
business data to produce relevant management information. 
Integrated means the records are logically related to one 
another so that all data on a topic can be retrieved by simple 
requests. The DBMS software represents the interface 
between the user and the computer’s operating system and 
database. The DBMS provides the facilities to: 


• create a database;

• add, amend and delete data;

• sort and search a database;

• create and print reports;

• perform relational, logical and string operations;

• modify the database structures.


A file that contains the details of the data; it contains the rules 
for the use of the database files. 


The facility to modify a database schema (logical or storage 
schema) with no consequent requirement to modify user 
views or programs interacting with the database nor any need 
to reload data. To provide independence has been a main 
motivation for the development of database management 
software. It is a relative term and different products provide 
different levels of data independence. It is particularly 
important for large shared databases that are required to 
evolve in line with user needs. The provision of data 
independence frequently conflicts with the need for efficient 
(i.e. fast) processing and usually necessitates some 
compromise in terms of the software techniques used. 


Logical data independence refers to the facility to change the 
logical schema and thus evolve the content of the database; 
physical data independence refers to the facility to change the 
storage schema and thus modify and improve performance. 
Resistance to alteration by system errors of data stored in a
computer. It is a condition that denotes only authorised and 
proper alteration of data. It is a measure of the reliability of 
data read from magnetic media, in terms of the absence of 
undetected errors. 


Perhaps the most common database language in use today is SQL
(Structured Query Language), commonly referred to as a query
language. It contains a combination of:


• Data Manipulation Language (DML), which is used to access
the data.


• Data Description Language (DDL), which is used to define
the data in the database in terms of the fields in the 
files or tables. 
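
The split between data description and data manipulation can be seen in a short sketch. It assumes Python's standard sqlite3 module and an in-memory database; the table name student and its fields are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database

# DDL: define the data in terms of a table and its fields.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: add, amend and retrieve the data.
conn.execute("INSERT INTO student (name) VALUES (?)", ("Ada",))
conn.execute("UPDATE student SET name = ? WHERE id = ?", ("Ada L.", 1))
rows = conn.execute("SELECT id, name FROM student").fetchall()

conn.close()
```

The CREATE TABLE statement belongs to the DDL, while INSERT, UPDATE and SELECT belong to the DML.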


An aspect of data type expressing the nature of values that 
are composite, i.e. not atoms. The nonatomic values have 
constituent parts (which need not themselves be atoms), and 
the data structure expresses how these constituents may be 
combined to form a compound value or selected from a 
compound value. Thus “date” regarded as a data structure is 
a set containing a member for every possible day, combined 
with operations to construct a date from its constituents — 
year, month and day — and to select a desired constituent. 


An implementation of a data structure involves both choosing 
a storage structure and providing a set of procedures / 
functions that implement the appropriate operations using the 
chosen storage structure. Formally, a data structure is 
defined as a distinguished domain in an abstract data type 
that specifies the structure. Computer solution of a real-world 
problem involves designing some ideal data structures, and 
then mapping these onto available data structures (e.g. 
arrays, records, lists, queues and trees) for the
implementation. 


Note that terms for data structures are used to denote
both the structure and the data having the structure. See also 
dynamic data structure, static data structure. 


Removing errors from a program. 


A style of analysis or design that relies primarily on the use of 
diagrams (as opposed to text or databases). The advantage is
the direct appeal to users, the disadvantage the limitation to 
two dimensions. See CORE, ERA diagram, JSD, MASCOT, 
Nassi-Shneiderman chart, SADT, SSADM, Yourdon.


A counting loop in a program, in which a section of code is
obeyed repeatedly with a counter taking successive values. 
Thus in Fortran, 


DO 10 I = 1,100
<statements>
10 CONTINUE


causes the <statements> to be obeyed 100 times. The 
current value of the counter variable is often used within the 
loop, especially to index an array. There are many syntactic 
variants: in Pascal and Algol-related languages the same 
basic construct appears as the FOR loop, e.g. 


FOR I := 1 TO 100 DO
BEGIN 
<statements> 
END 


This kind of loop is a constituent of almost all programming 
languages (except APL, which has array operations defined 
as operators in the language). See also DO-WHILE loop. 
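
As a rough modern counterpart, the same counting loop can be sketched in Python. The names `squares` and `i` are illustrative assumptions and do not come from the original text.

```python
# Counting loop: the counter i takes the successive values 1, 2, ..., 100
# and is used inside the loop body to index a list (array).
squares = [0] * 101            # index 0 is unused so indices match the counter

for i in range(1, 101):        # range(1, 101) yields 1 up to and including 100
    squares[i] = i * i         # the current counter value indexes the array
```

The body is obeyed exactly 100 times, once for each value of the counter.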


A linked list where each item contains links to both its 
predecessor and its successor. This makes it possible to 
traverse the list in either direction. The flexibility given by 
double linking must be offset against the overheads of the 
storage and the setting and resetting of the extra links 
involved when items are inserted or removed. 
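
A minimal sketch of such a list, assuming hypothetical class names `Node` and `DoublyLinkedList`, shows both the two-way traversal and the extra link maintenance on removal:

```python
class Node:
    """One item holding links to both its predecessor and its successor."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        """Insert at the end; two links are set instead of one."""
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def remove(self, node):
        """Unlink a node; the links of both neighbours must be reset."""
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def forward(self):
        values, node = [], self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values

    def backward(self):
        values, node = [], self.tail
        while node is not None:
            values.append(node.value)
            node = node.prev
        return values
```

Removing a node needs no search for its predecessor, which is the flexibility bought by the second link.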


A form of programming loop in which the condition for 
termination (continuation) is computed each time around the 
loop. There are several variants on this basic idea. For 
example, Pascal has 


WHILE <condition> DO 
BEGIN 
<statements> 
END 


and also 


REPEAT 
<statement> 
UNTIL <condition> 


The first is a while loop and the second is a repeat until loop. 
Apart from the obvious difference that the first specifies a 
continuation condition while the second specifies a termination 
condition, there is a more significant difference. The while 
loop is a zero-trip loop, i.e. the body will not be executed at all 
if the condition is false the first time around. In contrast, the 
body of a repeat-until loop must be obeyed at least once. 
Similar constructs are found in most languages, though there 
are many syntactic variations. See also DO loop. 
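
The zero-trip difference can be sketched in Python. Python has no repeat-until statement, so the second function emulates one with `while True` and `break`; the function names are assumptions made for this illustration.

```python
def while_loop_trips(n):
    """Zero-trip loop: the condition is tested before the body runs."""
    trips = 0
    while n > 0:
        trips += 1
        n -= 1
    return trips


def repeat_until_trips(n):
    """Emulates REPEAT ... UNTIL n <= 0: the body runs before the test."""
    trips = 0
    while True:
        trips += 1
        n -= 1
        if n <= 0:          # termination condition, checked after the body
            break
    return trips
```

With `n = 0` the while loop executes its body no times, whereas the repeat-until form executes it once.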


Extended Binary Coded Decimal Interchange Code.


See Information hiding.


An item of data consisting of a number of characters, bytes, 
words, or codes that are treated together, e.g. to form a 
number, a name, or an address. A number of fields make a 
record and the fields may be fixed in length or variable. The 
term came into use with the punched card systems and a field 
size was defined in terms of a number of columns. 


First in — first out. See queue. 


Information held on backing store (i.e. usually on magnetic 
disk or magnetic tape) in order (a) to enable it to persist
beyond the time of execution of a single job and / or (b) to 
overcome space limitations in main memory. Files may hold 
data, programs, documents, pictures, or any other information. 
They are referred to by file name. Files with a very brief 
existence (i.e. in case (b) above), or where they simply carry 
information between one job and the next in sequence are 
called work files. See also master file, data file. 


Information that describes a file, giving details such as its file 
name, generation number, date of last access, expiry date, 
and the structure of the records it contains. It is normally 
stored as a header record at the front of the file, held on 
magnetic tape or on disk. 


A software system that provides facilities for file management 
(often specifically of data files) at a level above that offered by 
operating systems (but in the case of data files below that 
offered by database management systems). 


A digit or character in a computer program which is given 
various values (often TRUE/FALSE) to indicate a situation. 
The flag is tested (used) at a later point(s) so that the program 
knows which way to proceed. 


An array whose lower and / or upper bounds are not fixed and 
may vary according to the values assigned to it. See also 
string. 


See parameter. 


A program unit that, given values for input parameters, 
computes a value. Examples include the standard functions 
such as sin(x), cos(x), exp(x); in addition most languages 
permit user-defined functions. A function is a ‘black box’ that 
can be used without any knowledge or understanding of the 
detail of its internal working. In some languages a function 
may have side effects.


A term used to describe the scope of an entity: global entities
are accessible from all parts of a program. By contrast, local 
entities are accessible only in the program module within 
which they are defined. 


The most basic conditional construct in a programming 
language, allowing a selection between two alternatives, 
depending on the truth or falsity of a given condition. Most 
languages also provide an IF ... THEN construct to allow 
conditional execution of a single statement or group of 
statements. Primitive languages, such as BASIC in its original 
form, restrict the facility to a conditional transfer of control, e.g. 
“IF A = 0 THEN 330” which is reminiscent of the conditional 
jump provided in the order code of every CPU. 


A string of characters used to identify (or name) some element 
of a program. The kind of element that can be named 
depends on the programming language; it may be a variable, 
a data structure, a procedure, a statement, a higher-level unit, 
or the program itself. 


A list of values of some particular data item contained in a 
record, enabling it to be retrieved more rapidly than by simple 
serial search. For example, a subscript is a value, usually 
integral, that selects a particular element of an array. The B+ 
tree (see B-tree) is an efficient form of multi-level index. 


This is an important concept in object-oriented systems, and 
takes place in two important ways. Firstly, information about 
objects is hidden from other objects by the system itself; this is
known as encapsulation. Secondly, the developer can choose 
to hide information about a class or object to ‘streamline’ that 
object to suit their needs. This is known as abstraction. 


An object can inherit from its class and a class may inherit 
from another class. This process is called inheritance. For 
example, a rice cooker, toaster, microwave and electric kettle 
are all classes in their own right, but each class is also a 
member of the kitchen appliances class. The kitchen 
appliance class is the parent or superclass of all the others, 
and these other classes inherit the characteristics of the 
kitchen appliance class; e.g., switch on, switch off, and 
buttons to turn the appliance's operations on and off. The rice
cooker, toaster, microwave and electric kettle are all therefore 
subclasses of the kitchen appliance class. 
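
The kitchen appliance example might be sketched as follows. The class and method names are illustrative assumptions chosen to mirror the text.

```python
class KitchenAppliance:
    """Superclass: characteristics shared by every appliance."""
    def __init__(self):
        self.is_on = False

    def switch_on(self):
        self.is_on = True

    def switch_off(self):
        self.is_on = False


class Toaster(KitchenAppliance):
    """Subclass: inherits switch_on / switch_off and adds its own operation."""
    def toast(self):
        return "toasting" if self.is_on else "switched off"


class ElectricKettle(KitchenAppliance):
    def boil(self):
        return "boiling" if self.is_on else "switched off"
```

Neither subclass defines `switch_on`, yet both can use it because it is inherited from the superclass.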


Maintenance of error-free and accurate data through the 
detection and removal of errors.


The circuitry enabling two devices to interchange data. This
may be either serial or parallel and is generally standardised. 
However one of the ‘devices’ could be a person. Although an 
object or device hides its operations from the outside world, it 
does allow a means of communicating with the outside world. 
This means of communicating is known as an interface. 
Interfaces are a feature of modern household items, e.g. 
televisions and videos have remote controls to change 
channel and alter volume, microwaves have buttons or dials to 
set temperature and cooking duration, washing machines 
have dials to choose washing cycle. Computers communicate 
with users through an interface, for instance operating 
systems and other software packages that are being used. 


The term used to describe a huge collection of interlinked 
networks which are generally accessible. 


A language processor that analyses a line of code and then 
carries out the specified action, rather than producing a 
machine-code translation to be executed later. 


Repetition of a sequence of instructions where the results 
from one pass of the loop are used as input to the instructions 
in the next pass of the loop. 


A network which uses internet technology but which is 
confined to one company and, in general, is not available to 
public access. 


A programming language used to develop software for the 
Internet or Intranet applications. 


A symbol in a programming language that has a special 
meaning for the compiler or interpreter. For example, 
keywords in BASIC include IF, THEN, PRINT. The keywords 
guide the analysis of the language, and in a simple language 
each keyword causes activation of a specific routine in the 
language processor. 


A syntactic structure or set of structures in a language to 
express a particular class of operations. The term is often 
used as a synonym for control structure. 


The last message, etc, to arrive is the first to be dealt with, i.e. 
messages are formed into a stack with the last arrival being 
put at the top of the stack. See also FIFO, stack. 


A character or group of characters that indicates the storage 
of an item of data. Thus when a field of an item A in a data 
structure contains the address of another item B, i.e. of its first 
word in memory, it contains a link to B. Two items are linked 
when one has a link to the other. See also linked list.


A list representation in which the items are not necessarily
sequential in storage. Access is made possible by the use in
every item of a link that contains the address of the next item
in the list. The last item in the list has a special null link to 
indicate that there are no more items in the list. See also 
doubly linked list, singly linked list. 


A finite ordered sequence of items (X1, X2, X3, ..., Xn) where n
>= 0. If n = 0, the list has no elements and is called the null
list (or empty list). If n > 0, the list has at least one element,
X1, which is called the head of the list (see also header). The
list consisting of the remaining items is called the tail of the 
original list. The tail of a null list is the null list, as is the tail of 
a list containing only one element. 


The items in a list can be arbitrary in nature, unless otherwise 
stated. In particular it is possible for an item to be another list, 
in which case it is known as a sub-list. For example, let L be 
the list (A, B, (C, D), E) then the third item of L is the list (C, 
D), which is a sub-list of L. If a list has one or more sub-lists it 
is called a list structure. If it has no sub-lists it is called a 
linear list. The two basic representation forms for lists are 
sequentially allocated lists and linked lists, the latter being 
more flexible. 
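
These definitions can be sketched with Python lists, which are sequentially allocated. The helper names `head`, `tail` and `is_linear` are assumptions introduced for this example.

```python
def head(lst):
    """Return the first item; the null list has no head."""
    if not lst:
        raise ValueError("the null list has no head")
    return lst[0]


def tail(lst):
    """Return the list of remaining items.

    Slicing an empty or one-item list gives the null list, as in the
    definition above.
    """
    return lst[1:]


def is_linear(lst):
    """A linear list contains no sub-lists."""
    return not any(isinstance(item, list) for item in lst)


# The list (A, B, (C, D), E) from the text; its third item is a sub-list.
L = ["A", "B", ["C", "D"], "E"]
```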


A word or symbol in a program that stands for itself rather 
than as a name for something else, i.e. an object whose value 
is determined by its denotation. Numbers are literals; if other 
symbols are used as literals it is necessary to use some form 
of quoting mechanism to distinguish them from variables.


A term applied to entities that are accessible only in a 
restricted part of a program, typically in a procedure or 
function body. By contrast, non-local entities are accessible in 
a wider scope and global entities are accessible throughout a 
program. The use of local entities can help to resolve naming 
conflicts, and may lead to a more efficient use of memory. 


A means of preventing more than one user simultaneously 
changing data in a common record or file. 


An operation on logical values, producing a Boolean result
(see also Boolean algebra). The operations may be monadic 
or dyadic, and are denoted by symbols known as operators. 
In general there are 16 logic operations over one or two 
operands; they include AND, OR, NOT, NAND, NOR, 
exclusive-OR, and equivalence. 


Logic operations involving more than two operands can 
always be expressed in terms of operations involving one or 
two operands. Those involving two operands can be 
expressed in terms of other operations involving one or two 
operands. 
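
As an illustration, the count of 16 can be checked by enumerating truth tables, and NAND can be used to express NOT, AND and OR. This is a sketch; the names `all_tables`, `not_`, `and_` and `or_` are assumptions made for the example.

```python
from itertools import product

# The four possible input pairs for a two-operand operation.
INPUTS = list(product([False, True], repeat=2))

# An operation is fixed by its output for each input pair, so there are
# 2**4 = 16 distinct truth tables (some depend on only one operand).
all_tables = list(product([False, True], repeat=len(INPUTS)))


def nand(a, b):
    return not (a and b)


# Other operations expressed purely in terms of NAND.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))
```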


Logic circuits are fabricated for the implementation of logic
operations on their input signals. The inputs may be words (or
bytes) and the logic operation is applied to each bit in 
accordance with Boolean algebra. 


A data type comprising the logical values TRUE and FALSE, 
with legal operations restricted to logic operations. 


Either of the two values TRUE and FALSE that indicate a truth 
value. Although a single bit is the most obvious computer 
storage structure that can be applied to logical data, larger 
units of store, such as byte, are frequently used in practice 
since they can be addressed distinctly. 


Code introduced into a program to have an undesirable effect 
following the occurrence of some later event. For example, a 
logic bomb may be programmed to destroy valuable data 
should the programmer’s name ever be deleted from the firm’s 
payroll. 


A sequence of instructions that is repeated until a prescribed 
condition, such as agreement with a data element or 
completion of a count, is satisfied. See also DO loop, iteration. 


A two-dimensional array. In computing, matrices are usually 
considered to be special cases of n-dimensional arrays, 
expressed as arrays with two indices. The notation for arrays 
is determined by the programming language. The two 
dimensions of a matrix are known as its rows and columns; a 
matrix with m rows and n columns is said to be an m × n matrix.


When an object receives a message from another object, it 
activates a method or operation. These messages are also 
known as requests or function calls. The television remote 
control sends a message to the television to perform an 
operation or activate a method such as change channel. This 
process of transmitting a message from the Message Sender 
to the Message Receiver is called message passing. The 
Message Sender does not know how the Message Receiver 
carries out this method as this information is hidden, hence 
the term information hiding. 


Messages may also contain arguments (parameters) to clarify 
the request. The television remote control will tell the 
television to change to a particular channel, to reduce the 
volume by one unit. A microwave can be asked to cook for a 
certain number of minutes, and a video recorder to turn itself on at a
particular time. These are all examples of arguments — data 
passed as part of the message. 
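
The remote control example might be sketched as below. `Television` and `RemoteControl` are hypothetical classes invented for this illustration; in Python a message is simply a method call, and its arguments travel with it.

```python
class Television:
    """Message receiver: it alone knows how each operation is carried out."""
    def __init__(self):
        self.channel = 1
        self.volume = 10

    def change_channel(self, channel):
        self.channel = channel

    def reduce_volume(self, units=1):
        self.volume = max(0, self.volume - units)


class RemoteControl:
    """Message sender: it says what to do, not how to do it."""
    def __init__(self, television):
        self._television = television

    def press_channel(self, number):
        # Message "change channel" with the argument `number`.
        self._television.change_channel(number)

    def press_volume_down(self):
        # Message "reduce volume" with the argument 1 (one unit).
        self._television.reduce_volume(1)
```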


Objects communicate with each other by sending messages 
to one another. An object sends a message to another object 
telling it to perform an operation. This works in the same way 
as remote controls that send messages to objects such as 
televisions to tell them to perform operations such as change 
channel. The message itself only tells the object to perform
the operation, not how to perform it. The object itself knows 
how to perform this operation from the class it belongs to. 


There is another fundamental difference between traditional 
programming and object orientation. Data traditionally had 
things done to it, whereas an object can do things. The things
an object can do are called methods (or may be referred to as
operations, which is not strictly accurate but is common practice).


A style of programming in which the complete program is 
decomposed into a set of components, termed modules, each 
of which is of a manageable size, has a well defined purpose, 
and has a well defined interface for use by other modules. 
Since the only alternative — that of completely monolithic 
programs — is untenable, the point is not whether programs 
should be modular but rather what criteria should be 
employed for their decomposition into modules. This was 
raised by David Parnas, who proposed that one major 
criterion should be that of information hiding. Prior to this, 
decomposition had typically been performed on an ad-hoc 
basis, or sometimes on the basis of stages of the overall 
processing to be carried out by the program, and only minor 
benefits had been gained. More recently there has been great 
emphasis on decomposition based on the use of abstract data 
types and on the use of objects or object orientation; such a 
decomposition can remain consistent with the principles of 
information hiding. 


This is a term that indicates the number of objects that are 
associated with one particular class. These associations can 
be one-to-one or one-to-many. For instance, a wife might 
have one husband; they have a one-to-one association. A 
mother, however, may have many children and she has a 
one-to-many association with those children. Multiplicity is 
very common in the real world; human beings walk on two 
legs (a one-to-two association), dogs walk on four legs (a one- 
to-four association), spiders walk on eight and insects on six. 


A notation for indicating an entity in a program or system. 
(The word can also be used as a verb) The kinds of entity 
that can be named depend on the context, and include 
variables, data objects, functions, types, and procedures (in 
programming languages), nodes, stations, and processes (in 
a data communication network), files, directories, devices (in 
operating systems), etc. The name denotes the entity, 
independently of its physical location or address. Names are 
used for long-term stability (e.g. when specifying a node in a
computer program) or for their ease of use by humans (who 
recognise the name more readily than an address). Names 
are converted to addresses by a process of name lookup. 


In many languages and systems, a name must be a simple 
identifier, usually a textual string. In more advanced 
languages, a name may be composed from several
elementary components according to the rules of the language.


An object can be:

1. Some physical thing in the ‘real world’. 

2. A representation of reality.

3. A tangible or visible thing. 

4. A thing to which action or thought can be directed. 

5. Passive: doing nothing until activated, e.g. like a switch.

6. Active: continually monitoring until conditions change, e.g.
like a thermostat.


An object is never: 

1. A value (e.g. Name).
2. A process (e.g. Sort).
3. A time (e.g. 5 minutes).


With reference to objects. See behaviours. Also called 
responsibilities. 


A relational database management system. 
A UK teletext service. 


Information passed to a subroutine, procedure or function. 
The definition of the procedure is written using formal 
parameters to denote data items that will be provided when 
the subroutine is called, and the call of the procedure includes 
corresponding actual parameters. See also parameter 
passing. 


A single scan through a body of data, for example by a 
compiler reading the program text or a statistical package 
reading its data. 


Practical Extraction and Report Language. 


Personal Identification Number, e.g. used with cards at cash 
machines. 


Often developers will find that operations can have the same 
name even though they are associated with different objects. 
Polymorphism allows developers to re-use terminology and to 
allow it to have more than one meaning. This can prove useful
for a number of reasons. System modellers can talk to clients
using terms that are familiar to them, and maintain the client's
own terminology. Some operations fall naturally into certain 
terminology such as open and close, and it would be 
preferable to use words that have an obvious meaning. The
ability to allow more than one meaning for each operation
means that the developer can maintain terminologies without
having to invent a new unique word every time a similar
operation occurs.


© NCC Education. All rights reserved. Unauthorised duplication is prohibited.


Programming Methods


Procedure


Program decomposition


Programmer


Program unit


Prototyping


Pseudocode


Pseudolanguage


Query language


V1.0


A section of a program that carries out some well-defined 
operation on data specified by parameters. It can be called 
from anywhere in a program, and different parameters can be 
provided for each call. 


The term procedure is generally used in the context of high 
level languages; in assembly language the word subroutine is 
more commonly employed. 


The breaking down of a complete program into a set of 
component parts, normally called modules. The 
decomposition is guided by a set of design principles or 
criteria that the identified modules should reflect. Since the 
decomposition determines the coarse structure of the 
program, the activity is also referred to as high level or 
architectural design. See also modular programming. 


A person responsible for writing computer programs. 


A constituent part of a large program, and in some sense self- 
contained. 


Prototyping is used to develop a quick implementation of the 
software prior to or during the software requirements phase. 
The client uses the prototype and provides feedback to the 
developers as to its strengths and weaknesses. This feedback 
is used to refine or change the prototype to meet the real 
needs of the customer. Prototyping can be either evolutionary
or throw-away. With the advent of user interface
representation tools that allow quick construction of
demonstrator interfaces this is increasingly viable as an
approach. 


Another name for pseudolanguage. 


Another name for pseudocode. A form of representation used 
to provide an outline description of the specification for a 
software module. Pseudolanguages contain a mixture of 
natural language expressions embedded in syntactic
structures taken from programming languages (such as IF ... 
THEN ...ELSE). The formality of the definition varies from ad 
hoc (e.g. defined within the project team) to being sufficiently 
formal to enable automatic parsing and syntax checking (e.g. 
supported by a CASE tool). Pseudolanguages are not 
intended to be executed by computer; they must be 
interpreted by people. 


Used with databases. See database sublanguage, database 
manipulation language and data dictionary. 




Queue 
(FIFO list; pushup 
stack; pushup list) 


Queue management 


RAM 


Record 


Relational model 


Report generators 


Requirements Analysis


A linear list where all insertions are made at one end of the list
and all removals and accesses at the other end. 
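
A minimal sketch using `collections.deque` from the Python standard library; the item names are made up for the example.

```python
from collections import deque

queue = deque()

# Insertions are made at one end (the rear)...
queue.append("first")
queue.append("second")
queue.append("third")

# ...and removals at the other end (the front), giving first in, first out.
served = queue.popleft()
```

A deque is used rather than a plain list because removing from the front of a Python list requires shifting every remaining item.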


A queue is characterised by the way in which customers (i.e. 
processes) join it in order to wait for service, and by the way in 
which customers already in the queue are selected for 
servicing. Both of these activities are controlled by the queue 
manager. 


Random Access Memory. 


A data structure in which there are a number of named 
components, called fields, not necessarily of the same type. It 
may have variants in which some of the components, known 
as variant fields, are absent; the particular variant for a given 
value would be distinguished by a discriminant or tag field. 
The record is widely recognised as one of the fundamental 
ways of aggregating data (another being the array) and many 
programming languages offer direct support for data objects 
that take the form of records (See structured variable). Such 
languages permit operations upon an entire record object as 
well as upon individual components. 
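
One way to sketch a record with variants in Python is a dataclass whose `kind` field acts as the tag. The `Shape` record and its fields are assumptions invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Shape:
    kind: str                        # tag (discriminant) field
    radius: Optional[float] = None   # variant field, used when kind == "circle"
    width: Optional[float] = None    # variant fields, used when
    height: Optional[float] = None   # kind == "rectangle"


def area(shape):
    """Select the fields to use according to the tag field."""
    if shape.kind == "circle":
        return 3.141592653589793 * shape.radius ** 2
    if shape.kind == "rectangle":
        return shape.width * shape.height
    raise ValueError("unknown variant: " + shape.kind)
```

Because a dataclass generates equality for the whole record, two `Shape` values can be compared in one operation as well as field by field.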


A data model that views information in a database as a 
collection of distinctly named tables. Each table has a 
specified set of named columns, each column name (also 
called an attribute) being distinct within a particular table, but 
not necessarily between tables. The entries within a particular 
column of a table must be atomic (that is single data items) 
and all of the same type. The logical records held in a 
relational database are viewed as rows in these tables. Each 
logical record is thus constrained to contain only a set of 
elementary data items of a pre-specified type. 


An example of a very early application generator, although 
they are now considered to be a fourth generation tool as are 
query languages. Report generators were designed to read 
and process files and produce reports with the facility to 
provide totals and sub-totals. 


The first stage in the software lifecycle. An accurate and 
complete set of user requirements is produced to define the 
characteristics required for an acceptable solution. This 
information is obtained mainly by direct interviews with current 
and future users of the system. 


A Requirements Analysis document contains the following 
information: 


• A clear understanding between the user and the
developer over the proposed system or solution.


• A list of the existing and new tools, facilities and
people available for developing the solution.


• A schedule for the next stages of the project with the
deliverables for each stage.


A word that has a specific role in the context in which it
occurs, and therefore cannot be used for other purposes. For
example, in many programming languages the words ‘IF’,
‘THEN’, ‘ELSE’ are used to organise the presentation of the 
written form of statements (between ‘THEN’ and ‘ELSE’ and 
following ‘ELSE’) whose execution is governed by the value of 
the Boolean expression between ‘IF’ and ‘THEN’. The use of 
if, then or else as identifiers is thus not permitted in these 
languages since they are reserved words. See also keyword. 


A value added at the end of a table and that can be 
recognised as a termination signal by a table lookup program. 
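
A small sketch of a table lookup that stops at such a value. Using `None` as the end marker and the table contents shown are assumptions made for the example.

```python
SENTINEL = None   # assumed end-of-table marker; it can never be a real entry

# Each real entry is a (name, code) pair; the last slot holds the marker.
table = [("IF", 1), ("THEN", 2), ("ELSE", 3), SENTINEL]


def lookup(table, key):
    """Scan until the key is found or the end marker is reached."""
    i = 0
    while table[i] is not SENTINEL:   # no separate length check is needed
        name, code = table[i]
        if name == key:
            return code
        i += 1
    return None                       # reached the marker: key not present
```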


An effect of a program unit that is not apparent from its 
parameters, for example altering a non-local variable or 
performing input/output. 


A linked list in which each item contains a single link to its 
successor. By following links it is possible to access the entire 
structure from the first item. 
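
A minimal sketch, with a hypothetical `Node` class and helper functions, in which `None` plays the part of the null link:

```python
class Node:
    """One item holding a value and a single link to its successor."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_values(values):
    """Build a list by linking nodes; returns the first item (or None)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_values(head):
    """Reach every item by following links from the first item."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    return values
```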


The complete lifetime of a software system from initial
conception through to final obsolescence. The term is most 
commonly used in contexts where programs are expected to 
have a fairly long useful life, rather than in situations such as 
experimental programming where programs tend to be run a
few times and then discarded. Traditionally the lifecycle has 
been modelled as a number of successive phases, typically: 


user requirements
system requirements
software requirements
overall design
detailed design
component production
component testing
integration and system testing
acceptance testing and release
operation and maintenance


Such a breakdown tends to obscure several important aspects 
of software production, notably the inevitable need for iteration 
around the various lifecycle activities in order to correct errors, 
modify decisions that prove to have been misguided, or reflect 
changes in the overall requirements for the system. It is also 
somewhat confusing to treat operation and maintenance as 
just another lifecycle phase since during this period it may be 
necessary to repeat any or all of the activities required for 
initial development of the system. There has therefore been a 
gradual movement towards more sophisticated models of 
software lifecycle. These provide explicit recognition of 
iteration, and often treat the activities of the operation and 
maintenance period simply as iteration occurring after rather 
than before release of the system for operational use. 




Sorting 


Sort key (key) 


Specification 


Specification language 


Stack 


State 


Stepwise refinement


Rearranging information into ascending or descending order
by means of sort keys. Sorting may be useful in three ways: 
to identify and count all items with the same identification, to 
compare two files, and to assist in searching, as used in a
dictionary. An internal sorting method keeps the information 
within the computer’s high speed RAM; an external sorting 
method uses backing store. There are a wide variety of 
methods. 


The information, associated with a record of information, that 
is to be compared in a sorting process. It follows that the sort 
keys must be capable of being ordered, i.e. any two keys k1 and k2
are such that k1 < k2, k1 = k2 or k1 > k2.
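
As a sketch, Python's built-in `sorted` takes a `key` function that extracts the sort key from each record. The records below are made-up sample data.

```python
# Hypothetical records; either field can serve as the sort key.
records = [
    {"name": "Patel", "age": 31},
    {"name": "Ahmed", "age": 25},
    {"name": "Wong", "age": 42},
]

# Ascending order on the numeric key "age".
by_age = sorted(records, key=lambda record: record["age"])

# Descending order on the string key "name".
by_name_desc = sorted(records, key=lambda record: record["name"], reverse=True)
```

Both integers and strings can be compared with `<`, `=` and `>`, so either field is a valid sort key.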


A formal description of a system, or a component or module of 
a system, intended as a basis for further development. The 
expression of the specification may be in text in a natural 
language (e.g. English), in a specification language, which 
may be a formal mathematical language, or by the use of
specification stages of a methodology that includes a 
diagrammatic technique. Characteristics of a good 
specification are that it should be unambiguous, complete, 
verifiable, consistent, modifiable, traceable, and usable after 
development. 


A language that is used in expressing a specification. It has a 
formally defined syntax and semantics, and its design is 
based on a mathematical method for modelling or defining 
systems (e.g. set theory, equations and initial algebras, 
predicate logic). Examples include SADT, RSL, VDM, OBJ 
and Z. 


A linear list where all accesses, insertions and removals are 
made at one end of the list, called the top. This implies 
access on a last in first out (LIFO) basis: the most recently 
inserted item on the list is the first to be removed. The 
operations push and pop refer to the insertion and removal of 
items at the top of the stack. 
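
A minimal sketch using a Python list, whose end serves as the top of the stack:

```python
stack = []

# push: insert at the top
stack.append("a")
stack.append("b")
stack.append("c")

# pop: remove from the top; the most recently inserted item comes off first
top = stack.pop()
```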


An object’s state can be thought of as the things it knows. For
example, an object such as a car might have an engine type, a
make, model number and a colour. Software objects have 
state and behaviours; these form the structure of the object. 
The object’s state is made up of items of data called attributes 
(or properties) which describe aspects of the object, whilst its 
behaviours are the operations that the object carries out. 


The repetitive breaking down of a procedure until it can be 
programmed in a straightforward way, probably in a structured
language such as Pascal. 


Definition: An approach to software development in which an 
initial highly abstract representation of some required program 
is gradually refined through a sequence of intermediate 
representations to yield a final program in some chosen
programming language. The initial representation employs
notations and abstractions that are appropriate for the 
problem being addressed. Subsequent development then 
proceeds in a sequence of small steps. Each step refines 
some aspect of the representation produced by the previous 
step, thus yielding the next representation of the sequence. 


Typically a single step involves simultaneous refinement of 
both data structures and operations, and is small enough to 
be performed with some confidence that the result is correct. 
Refinement proceeds until the final representation in the 
sequence is expressed entirely in the chosen programming 
language. This approach is normally associated with N. Wirth, 
designer of the Pascal and Modula languages. Compare 
structured programming. 


The mapping from a data structure to its implementation 
(which may be another data structure). Thus a date may be 
represented as a vector of three integers (with six 
permutations to choose from), directly as a string of 
characters, or, in more recent high level languages, as a 
record with three selectors — day, month and year. A good 
choice of storage structure permits an easy and efficient 
implementation of a given data structure. 
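
The three representations of a date mentioned above might look as follows in Python. This is a sketch only: the variable names, the day/month/year ordering and the `/` separator are assumptions made for illustration.

```python
from typing import NamedTuple

# 1. A vector of three integers (here ordered day, month, year;
#    five other orderings would be equally possible).
as_vector = (25, 12, 1999)

# 2. Directly as a string of characters.
as_string = "25/12/1999"


# 3. A record with three named selectors.
class Date(NamedTuple):
    day: int
    month: int
    year: int


as_record = Date(day=25, month=12, year=1999)
```

The record form lets code refer to `as_record.month` rather than remembering which position in the vector holds the month.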


A sorting algorithm that looks at each sort key in turn, and on 
the basis of this places the record corresponding to the sort 
key correctly with respect to previous sort keys. 
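
The description above matches what is commonly called an insertion sort. A minimal sketch, assuming the records are simply sort keys held in a Python list (the function name `insertion_sort` is illustrative):

```python
def insertion_sort(keys):
    """Return a new list with the keys in ascending order."""
    result = list(keys)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger keys one place right to open a gap for the current key.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        # Place the current key correctly with respect to the earlier keys.
        result[j + 1] = current
    return result
```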


A sorting algorithm based upon finding successively the 
record with the largest sort key and putting it in the correct 
position, then the record with the next largest key, etc. 
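
The description above matches what is commonly called a selection sort. A minimal sketch, again assuming the records are plain sort keys in a Python list (the function name `selection_sort` is illustrative):

```python
def selection_sort(keys):
    """Return a new list with the keys in ascending order."""
    result = list(keys)
    # 'end' marks the position that the next largest key belongs in.
    for end in range(len(result) - 1, 0, -1):
        largest = 0
        for i in range(1, end + 1):
            if result[i] > result[largest]:
                largest = i
        # Put the largest remaining key into its correct position.
        result[largest], result[end] = result[end], result[largest]
    return result
```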


A flexible one-dimensional array, i.e. a flexible vector, of 
symbols where the lower bound of the vector is fixed at unity 
but the upper bound, i.e. the string length, may vary. 


A type of input to a graphics system consisting of a sequence 
of characters. The usual input device is a keyboard. See also 
logical input device. 


Any one-dimensional array of characters. In formal language 
theory a string is often referred to as a word. 


Of a program. See program structure. 
The relationship between parts of a compound object. 
See data structure, control structure, storage structure. 


A substitute component that is employed temporarily in a 
program so that progress can be made e.g. with compilation 
or testing, prior to the genuine component becoming available. 
For procedure calls a simple return is often all that is needed, 
although in some cases data may be needed as well. Stubs 
should be as simple as possible. 
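
A stub can be sketched as follows. Both function names are hypothetical: `calculate_tax` stands in for a component that has not been written yet, and `invoice_total` is the genuine code that needs it in order to run.

```python
def calculate_tax(amount):
    """Stub: stands in for the real tax calculation, not yet available."""
    return 0.0  # fixed dummy value, just enough to let callers run


def invoice_total(net_amount):
    """Genuine component under test; it depends on calculate_tax."""
    return net_amount + calculate_tax(net_amount)


total = invoice_total(100.0)
```

Once the real `calculate_tax` is available it replaces the stub without any change to `invoice_total`.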


Glossary 


Subroutine 


Subscript 


Syntax (syntax rules) 


Systems analysis 


Systems Analysts 
Table 


Top-down 
development 
A piece of code which is obeyed ‘out of line’, i.e. control is 
transferred to the subroutine, and on its completion control 
reverts to the instruction following the call. (The instruction 
code of the CPU usually provides subroutine jump and return 
instructions to facilitate this operation.) A subroutine saves 
space since it occurs only once in the program, though it may 
be called from many different places in the program. It also 
facilitates the construction of large programs since 
subroutines can be formed into libraries for general use. (The 
same concept appears in high level languages as the 
procedure). 
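
In a high level language the same idea might be sketched like this, with Python as an assumed language and `banner` as a hypothetical subroutine name. The code appears once but is called from two places.

```python
def banner(title):
    """Subroutine: written once, callable from many places."""
    return "=== " + title + " ==="


# Control passes to banner() and returns here after each call.
report = [banner("Sales"), banner("Stock")]
```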


A means of referring to particular elements in an ordered 
collection of elements. For example, if R denotes such a 
collection of names then the ith name in the collection may be 
referenced by Rᵢ (i.e. R subscript i). This printed form is the 
origin of the term but it is also used when the subscript is 
written on the same line, usually in parentheses or brackets: 
R(i) or R[i]. See also index, array. 
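
As an illustration of the bracket form, Python writes a subscript as `R[i]`. The list contents here are made up, and note that Python counts subscripts from 0 rather than 1.

```python
R = ["Ada", "Grace", "Alan"]  # an ordered collection of names

i = 1
name = R[i]  # subscripts start at 0 in Python, so R[1] is the second name
```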


The rules defining the legal sequences of symbolic elements 
in a language. The syntax rules define the form of the various 
constructs in the language, but say nothing about the meaning 
of these constructs. Examples of constructs are: expressions, 
procedures, and programs (in the case of programming 
languages) and terms, well formed formulas, and sentences 
(in the case of logical languages). 


The analysis of the role of a proposed system and the 
identification of a set of requirements that the system should 
meet, and thus the starting point for systems design. The 
term is most commonly used in the context of commercial 
programming, where those involved in software development 
are often classed as either systems analysts or programmers. 
The systems analysts are responsible for identifying a set of 
requirements (i.e. systems analysis) and producing a design. 
The design is then passed to the programmers, who are 
responsible for the actual implementation of the system. 


See systems analysis. 


A collection of records. Each record may store information 
associated with a key by which specific records are found, or 
the records may be arranged in an array so that the index is 
the key. In commercial applications the word table is often 
used as a synonym for matrix or array. 
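
Both arrangements can be sketched in Python. The stock codes, field names and values are invented for illustration.

```python
# Records found by a key: a dictionary maps each key to its record.
stock_by_code = {
    "A100": {"description": "bolt", "quantity": 250},
    "B200": {"description": "washer", "quantity": 40},
}
washer = stock_by_code["B200"]

# Records arranged in an array so that the index is the key.
day_names = ["Monday", "Tuesday", "Wednesday"]
second_day = day_names[1]
```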


An approach to program development in which progress is 
made by defining required elements in terms of more basic 
elements, beginning with the required program and ending 
when the implementation language is reached. At every stage 
during top-down development each of the undefined elements 
from the previous stage is defined. In order to do this, an 
appropriate collection of more basic elements is introduced, 
and the undefined elements are defined in terms of these 
more basic elements (more basic meaning that the element is 
closer to the level that can be directly expressed in the 
implementation language). These more basic elements will in 
turn be defined at the next stage in terms of still more basic 
elements, and so on until at some stage the elements can be 
defined directly in the implementation language. 


In practice, pure top-down development is not possible: the 
choice of more basic elements at each stage must always be 
guided by an awareness of the facilities of the implementation 
language, and even then it will often be discovered at a later 
stage that some earlier choice was inappropriate, leading to 
the need for iteration. 
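
A small sketch of the top-down idea, assuming Python as the implementation language. All function names are hypothetical. The required program, `produce_report`, is written first in terms of three more basic elements, which are then defined directly in the language.

```python
def produce_report(lines):
    """Top level: the required program, defined via more basic elements."""
    records = parse(lines)
    total = summarise(records)
    return format_report(total)


# Next stage: each undefined element is now defined directly.
def parse(lines):
    """Turn non-blank text lines into integers."""
    return [int(line) for line in lines if line.strip()]


def summarise(records):
    return sum(records)


def format_report(total):
    return "Total: " + str(total)


report = produce_report(["1", "2", " ", "3"])
```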


A testing / debugging aid. The trace package offered by most 
compilers and interpreters allows a programmer to watch the 
program execute by displaying the name of the current 
module or the number of the current statement under 
execution. Some packages also allow the programmer to 
specify certain variables, which are then displayed whenever 
their values change. 


A sentinel that occurs at the end of data organised in 
sequential form, e.g. on magnetic tape. Trailer labels typically 
include summary statistics of the data, e.g. the total number of 
records in a file. 
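
A trailer holding a record count might be sketched as below. The `TRAILER count=` layout and both function names are assumptions made for this example, not a standard format.

```python
def write_with_trailer(records):
    """Return the records followed by a trailer giving the record count."""
    lines = list(records)
    lines.append("TRAILER count=" + str(len(lines)))
    return lines


def read_with_trailer(lines):
    """Return the records, checking them against the trailer's count."""
    *records, trailer = lines
    expected = int(trailer.split("=")[1])
    if expected != len(records):
        raise ValueError("record count does not match trailer")
    return records


stored = write_with_trailer(["rec1", "rec2", "rec3"])
recovered = read_with_trailer(stored)
```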


This contains the record of activity that affects the data in a 
database during a transaction. It is used to back up databases 
and for rebuilding files if they become damaged or destroyed. 
It is straightforward to use for backups as all transactions are 
recorded and the previous day’s copy of the database is 
considered to be the current one, which is then updated using 
the transaction log. 
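
The recovery step described above can be sketched as follows, assuming a database modelled as a Python dictionary and a log of `(operation, key, value)` entries. The `restore` function and the `"put"`/`"delete"` operation names are hypothetical.

```python
def restore(backup, transaction_log):
    """Rebuild the current database from a backup plus the log."""
    database = dict(backup)  # start from the previous day's copy
    for operation, key, value in transaction_log:
        if operation == "put":
            database[key] = value
        elif operation == "delete":
            database.pop(key, None)
    return database


backup = {"A100": 250, "B200": 40}
log = [("put", "A100", 240), ("delete", "B200", None), ("put", "C300", 5)]
current = restore(backup, log)
```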


They have two subscripts to identify an element of the array. 
They can be thought of as the rows and columns in a table. 
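
In Python a two-dimensional array can be approximated by a list of rows. This is a sketch with invented values; the first subscript selects the row and the second the column, both counted from 0.

```python
table = [
    [1, 2, 3],  # row 0
    [4, 5, 6],  # row 1
]

row, column = 1, 2
element = table[row][column]  # two subscripts identify one element
```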


A language and method for developing object oriented 
applications. 


The ease with which a user can use a system to carry out its 
functions. 


A qualitative term applied to interactive systems (hardware 
plus software) that are designed to make the user’s task as 
easy as possible by providing feedback. Ways that help to 
make a system user-friendly include: 
• list of valid commands available on request; 
• use of a graphical user interface; 
• ability to undo actions made in error or by accident; 
• use of graphics and colour to indicate what is 
happening; 
• availability of a help system giving information 
appropriate to the current situation; 
• choice of interaction methods to suit personal 
preference and level of expertise; 
• immediate verification of data input, such as checking 
that a number is in the correct range or by word-by-word 
spell checking. 


Glossary 


Variable 


Validation and 
Verification (V & V) 


Walkthrough 


White box testing 


WYSIWYG 


Word (machine word; 
computer word) 


A unit of storage that can be modified during program 
execution, usually by assignment or read operations. A 
variable is generally denoted by an identifier or by a name. 


The name that denotes a modifiable unit of storage. 


A generic term for the complete range of checks that are 
performed on a system in order to increase confidence that 
the system is suitable for its intended purpose. Although a 
precise distinction is not always drawn, the verification aspect 
normally refers to completely objective checking of conformity 
to some well-defined specification, for example, to compare a 
second transcription with the first as when password changes 
are made. The validation aspect normally refers to the checks 
built into a computer program which provide the program with 
powers of judgement as to the suitability of the data. These 
validation checks can be provided at the input stage, when 
data is checked to prevent invalid values being passed for 
processing, or at the update stage, when the consistency of the 
input data with stored data is checked. 
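
The two kinds of check can be sketched as follows. The function names and the 0 to 120 age range are assumptions chosen for illustration.

```python
def validate_age(text):
    """Validation: judge whether the input is suitable for processing."""
    try:
        age = int(text)
    except ValueError:
        return False
    return 0 <= age <= 120


def verify_entry(first, second):
    """Verification: compare a second transcription with the first."""
    return first == second


age_ok = validate_age("42")
password_ok = verify_entry("s3cret", "s3cret")
```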


A product review performed by a formal team. A number of 
such reviews may be held during the lifetime of a software 
project, covering, for example, requirements, specification, 
design and implementation. The review is formally constituted; 
there is a clear statement of the contribution that each 
member of the review team is required to make, and a step by 
step procedure for carrying out the review. The person 
responsible for development of the product walks through the 
product for the benefit of the other reviewers, and the product 
is then openly debated with a view to uncovering problems or 
identifying desirable improvements. 


Based on knowledge of the internal logic of an application’s 
code. Tests are based on coverage of code statements, 
branches, paths, conditions. White box testing aims to ensure 
that every part of the program has been activated during 
testing so that no logic bombs are left. See black box testing, 
logic bombs. 
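
A minimal sketch of branch coverage, assuming a hypothetical `classify` function with a pass mark of 40. The test cases are chosen by reading the code so that each branch of the `if` is activated at least once.

```python
def classify(mark):
    if mark >= 40:
        return "pass"  # branch taken when the condition is true
    return "fail"      # branch taken when the condition is false


# One case per branch, placed on either side of the boundary.
test_cases = [(40, "pass"), (39, "fail")]
results = [classify(mark) == expected for mark, expected in test_cases]
```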


What You See Is What You Get. For example, a word 
processed document appearing on screen exactly as it would 
appear on paper. 


A vector of bits that is treated as a unit by the computer 
hardware. The number of bits, referred to as the word length 
or word size, is typically 16, 32 or 64. The memory of a 
computer is divided into words (and possibly subdivided into 
bytes). A word is usually long enough to contain an instruction 
or an integer. 
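
The effect of a fixed word length can be imitated in Python by masking results to a chosen number of bits. This is a sketch assuming a 16-bit word; `WORD_SIZE`, `WORD_MASK` and `add_words` are illustrative names.

```python
WORD_SIZE = 16                     # assumed word length in bits
WORD_MASK = (1 << WORD_SIZE) - 1   # 0xFFFF: all 16 bits set


def add_words(a, b):
    """Add two unsigned values, keeping only the bits that fit in one word."""
    return (a + b) & WORD_MASK


wrapped = add_words(0xFFFF, 1)  # the carry out of the top bit is lost
```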